We estimate the conditional distribution of trade-to-trade price changes using ordered
probit, a statistical model for discrete random variables. Such an approach takes into account
the fact that transaction price changes occur in discrete increments, typically eighths
of a dollar, and occur at irregularly spaced time intervals. Unlike existing continuous-
time/discrete-state models of discrete transaction prices, ordered probit can capture the
effects of other economic variables on price changes, such as volume, past price changes,
and  the  time  between  trades.  Using  1988  transactions  data  for  over  100  randomly  chosen 
U.S.  stocks,  we  estimate  the  ordered  probit  model  via  maximum  likelihood  and  use  the 
parameter  estimates  to  measure  several  transaction-related  quantities,  such  as  the  price 
impact  of  trades  of  a  given  size,  the  tendency  towards  price  reversals  from  one  transaction 
to  the  next,  and  the  empirical  significance  of  price  discreteness. 
1.  Introduction.

Common to virtually all empirical investigations of the microstructure of securities
markets is the need for a statistical model of asset prices that can capture the salient
features of price movements from one transaction to the next. For example, because there are
several theories of why bid/ask spreads exist, a stochastic model for prices is a prerequisite
to empirically decomposing observed spreads into components due to order-processing
costs, adverse selection, and specialist market power.^1 The benefits and costs of particular
aspects of a market's microstructure, such as margin requirements, the degree of competition
faced by dealers, the frequency that orders are cleared, and intra-day volatility also
depend intimately on the particular specification of price dynamics.^2 Even the event study,
a tool that does not explicitly assume any particular theory of the market microstructure,
depends heavily on price dynamics.^3 In fact, it is difficult to imagine an economically
relevant feature of transaction prices and the market microstructure that does not hinge
on such price dynamics.

Since  stock  prices  are  perhaps  the  most  closely  watched  economic  variables  to  date, 
they  have  been  modeled  by  many  competing  specifications,  beginning  with  the  simple 
random  walk  or  Brownian  motion.  The  majority  of  such  specifications  have  been  unable 
to capture at least three aspects of transactions prices. First, on most U.S. stock exchanges
prices  are  quoted  in  increments  of  eighths  of  a  dollar,  a  feature  not  captured  by  stochastic 
processes  with  continuous  state  spaces.  Of  course,  discreteness  is  less  problematic  for 
coarser-sampled  data,  which  may  be  well-approximated  by  a  continuous-state  process. 
But  discreteness  is  of  paramount  importance  for  intra-daily  price  movements,  since  such 
finely-sampled price changes may take on only five or six distinct values.^4

Second,  another  distinguishing  feature  of  transaction  prices  is  their  timing,  which  is 
irregular  and  random.  Therefore,  such  prices  may  be  modeled  by  discrete-time  processes 
only if we are prepared to ignore the information contained in waiting times for transactions.

Finally,  although  many  have  computed  correlations  between  transaction  price  changes 
and  other  economic  variables,  to  date  none  of  the  existing  models  of  discrete  transaction 
prices have been able to quantify such effects formally. Such models have focused primarily


^1 For example, see Glosten and Harris (1988), Hasbrouck (1988), Roll (1984), and Stoll (1989).

^2 See Cohen et al. (1986), Harris, Sofianos, and Shapiro (1990), Hasbrouck (1991a,b), Madhavan and Smidt (1991), and Stoll
and Whaley (1990).

^3 See, for example, Barclay and Litzenberger (1988).

^4 The implications of discreteness have been considered in many studies, e.g., Cho and Frees (1988), Gottlieb and Kalay
(1985), Harris (1989a,b, 1991), and Petersen (1986).
on the unconditional distribution of price changes, whereas what is often of more economic
interest is the conditional distribution, conditioned on quantities such as volume, time
between trades, and the sequence of past price changes. For example, one of the unanswered
empirical questions in this literature is what the total costs of immediate execution are,
which many take to be a measure of market liquidity. Perhaps the largest component of
such costs is the price impact of large trades. Indeed, a floor broker seeking to unload
100,000 shares of stock will generally break up the sale into smaller blocks to minimize
the price impact of the trades. How do we measure price impact? Such a question is a
question about the conditional distribution of price changes, conditional upon a particular
sequence of volume and price changes, i.e., order flow.

In  this  paper,  we  propose  a  specification  of  transaction  price  changes  that  addresses 
all  three  of  these  issues,  and  yet  is  still  tractable  enough  to  permit  estimation  via  standard 
techniques.  This  specification  is  known  as  ordered  probit,  a  technique  used  most  frequently 
in  cross-sectional  studies  of  dependent  variables  that  take  on  only  a  finite  number  of  values 
possessing a natural ordering.^5 Heuristically, ordered probit analysis is a generalization
of  the  linear  regression  model  to  cases  where  the  dependent  variable  is  discrete.  As  such, 
among  the  existing  models  of  stock  price  discreteness,  ordered  probit  is  perhaps  the  only 
specification  that  can  easily  capture  the  impact  of  "explanatory"  variables  on  price  changes 
while also accounting for price discreteness and irregular trade times.

Underlying the analysis is a "virtual" regression model with an unobserved continuous
dependent variable Z* whose conditional mean is a linear function of observed "explanatory"
variables. Although Z* is unobserved, it is related to an observable discrete random
variable Z, whose realizations are determined by where Z* lies in its domain or state
space. By partitioning the state space into a finite number of distinct regions, Z may be
viewed as an indicator function for Z* over these regions. For example, a discrete random
variable Z taking on the values $\{-\frac{1}{8}, 0, \frac{1}{8}\}$ may be modeled as an indicator variable
that takes on the value $-\frac{1}{8}$ whenever $Z^* \le \alpha_1$, the value $0$ whenever $\alpha_1 < Z^* \le \alpha_2$, and
the value $\frac{1}{8}$ whenever $Z^* > \alpha_2$. Ordered probit analysis consists of estimating $\alpha_1$, $\alpha_2$, and
the coefficients of the unobserved regression model for Z*.

Since $\alpha_1$, $\alpha_2$, and Z* may depend on a vector of "regressors" X, ordered probit analysis
is  considerably  more  general  than  its  simple  structure  suggests.  In  fact,  it  is  well  known 


^5 For example, the dependent variable might be the level of education, as measured by three categories: less than high school,
high school, and college education. The dependent variable is discrete, and is naturally ordered since college education always
follows high school. See Maddala (1983) for further details.

^6 See, for example, Ball (1988), Cho and Frees (1988), Gottlieb and Kalay (1985), and Harris (1991).
that ordered probit can fit any arbitrary multinomial distribution. However, because of the
underlying  linear  regression  framework,  ordered  probit  can  also  capture  the  price  effects  of 
many  economic  variables  in  a  way  that  models  of  the  unconditional  distribution  of  price 
changes  cannot. 

To  motivate  our  methodology  and  focus  it  on  specific  economic  issues,  we  consider 
three questions concerning the behavior of transaction prices. First, how does the particular
sequence of trades affect the conditional distribution of price changes, and how
do these effects differ across stocks? For example, does a sequence of three consecutive
buyer-initiated trades ["buys"] generate price pressure, so that the next price change is
more likely to be positive than if the sequence were three consecutive seller-initiated trades
["sells"], and how does this pressure change from stock to stock? Second, does trade size
affect price changes as some theories suggest, and if so, what is the price impact per unit
volume of trade from one transaction to the next? Third, does price discreteness matter?
In particular, can the conditional distribution of price changes be modeled as a simple
linear regression of price changes on explanatory variables without accounting for discreteness
at all? Within the context of the ordered probit framework, we shall obtain sharp
answers  to  each  of  these  questions. 

In Section 2 we review the ordered probit model and describe its estimation via maximum
likelihood. We describe the data in Section 3 by presenting detailed summary statistics
for an initial sample of 11 stocks. In Section 4 we discuss the empirical specification
of the ordered probit model and the selection of conditioning or "explanatory" variables.
The maximum likelihood estimates for our initial sample are reported in Section 5, along
with some diagnostic specification tests. In Section 6 we use these maximum likelihood
estimates in three specific applications: (1) testing for order-flow dependence; (2) measuring
price impact; and (3) comparing ordered probit to simple linear regression. And as a
check  on  the  robustness  of  our  findings,  in  Section  7  we  present  less  detailed  results  for  a 
larger  and  randomly  chosen  sample  of  100  stocks.  We  conclude  in  Section  8. 
2.  The  Ordered  Probit  Model.

Consider a sequence of transaction prices $P(t_0), P(t_1), P(t_2), \ldots, P(t_n)$ observed at
times $t_0, t_1, t_2, \ldots, t_n$, and denote by $Z_1, Z_2, \ldots, Z_n$ the corresponding price changes,
where $Z_k = P(t_k) - P(t_{k-1})$ is assumed to be an integer multiple of some divisor called
a "tick" [such as an eighth of a dollar]. Let $Z_k^*$ denote an unobservable continuous random
variable such that:

$$
Z_k^* = X_k'\beta + \epsilon_k, \qquad E[\epsilon_k \mid X_k] = 0, \qquad \epsilon_k \ \text{i.n.i.d.} \ N(0, \sigma_k^2) \qquad (2.1)
$$

where "i.n.i.d." indicates that the $\epsilon_k$'s are independently but not identically distributed,
and $X_k$ is a $q \times 1$ vector of predetermined variables that governs the conditional mean of
$Z_k^*$. Note that subscripts are used to denote "transaction" time, whereas time arguments
$t_k$ denote calendar or "clock" time, a convention we shall follow throughout the paper.

The essence of the ordered probit model is the assumption that observed price changes
$Z_k$ are related to the continuous variable $Z_k^*$ in the following manner:

$$
Z_k = \begin{cases}
s_1 & \text{if } Z_k^* \in A_1 \\
s_2 & \text{if } Z_k^* \in A_2 \\
\;\vdots & \qquad \vdots \\
s_m & \text{if } Z_k^* \in A_m
\end{cases}
\qquad (2.2)
$$

where the sets $A_j$ form a partition of the state space $S^*$ of $Z_k^*$, i.e., $S^* = \bigcup_{j=1}^{m} A_j$ and
$A_i \cap A_j = \emptyset$ for $i \neq j$, and the $s_j$'s are the discrete values that comprise the state space $S$ of
$Z_k$. The motivation for the ordered probit specification is to uncover the mapping between
$S^*$ and $S$ as a function of economic variables or "regressors." In our current application
the $s_j$'s are $0, -\frac{1}{8}, +\frac{1}{8}, -\frac{2}{8}, +\frac{2}{8}$, and so on, and for simplicity we define the state-space
partition of $S^*$ to be intervals:


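$$
\begin{aligned}
A_1 &= (-\infty, \alpha_1] \qquad && (2.3) \\
A_2 &= (\alpha_1, \alpha_2] \qquad && (2.4) \\
&\;\;\vdots \\
A_i &= (\alpha_{i-1}, \alpha_i] \qquad && (2.5) \\
&\;\;\vdots \\
A_m &= (\alpha_{m-1}, \infty). \qquad && (2.6)
\end{aligned}
$$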

Although  the  observed  price  change  can  be  any  number  of  ticks,  positive  or  negative, 
we  assume  that  m  in  (2.2)  is  finite  to  keep  the  number  of  unknown  parameters  finite.  This 
poses  no  problems  since  we  may  always  let  some  states  in  S  represent  a  multiple  [and 
possibly countably infinite] number of values for the observed price change. For example,
in our empirical application we define $s_1$ to be a price change of -4 ticks or less, $s_9$ to be
a price change of +4 ticks or more, and $s_2$ to $s_8$ to be price changes of -3 ticks to +3
ticks respectively. This parsimony is obtained at the cost of losing price resolution - under
this specification the ordered probit model does not distinguish between price changes of
+4 and price changes greater than +4 [since the +4-tick outcome and the greater than
+4-tick outcome have been grouped into a common event], and similarly for price changes
of -4 ticks versus price changes less than -4. Of course, in principle the resolution may be
made  arbitrarily  finer  by  simply  introducing  more  states,  i.e.,  by  increasing  m.  However, 
in  practice  the  data  will  impose  a  limit  on  the  fineness  of  price  resolution  simply  because 
there  will  not  exist  realizations  for  the  extreme  states  when  m  is  too  large,  in  which  case 
a  subset  of  the  parameters  is  not  identified  and  cannot  be  estimated. 
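
The grouping of price changes into states can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `to_ticks` and `state_index` are hypothetical, the tick is taken to be one eighth of a dollar, and states are numbered 1 through 9 as in the text.

```python
TICK = 0.125  # one eighth of a dollar, as assumed in the text


def to_ticks(price_change):
    """Convert a dollar price change into a whole number of ticks."""
    return round(price_change / TICK)


def state_index(n_ticks, max_ticks=4):
    """Map a tick count to a state number in 1..(2*max_ticks + 1).

    Changes of -max_ticks or less share state 1, and changes of
    +max_ticks or more share the last state, so resolution is lost
    only in the two extreme states.
    """
    clipped = max(-max_ticks, min(max_ticks, n_ticks))
    return clipped + max_ticks + 1


# Example: a change of -7/8 and a change of -4/8 fall in the same state.
example_states = [state_index(to_ticks(dz)) for dz in (-0.875, -0.5, 0.0, 0.375, 1.5)]
```

Raising `max_ticks` corresponds to increasing m; the data limit how far this can go, since extreme states with no observations leave their boundaries unidentified.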

Observe that the $\epsilon_k$'s in (2.1) are assumed to be conditionally independently but not
identically distributed.^8 This allows for clock-time effects, as in the case of an arithmetic
Brownian motion where the variance $\sigma_k^2$ of price changes is linear in the time between
trades. We also allow for more general forms of conditional heteroskedasticity by letting
$\sigma_k^2$ depend linearly on other economic variables $W_k$, which differs from Engle's (1982)
ARCH process only in its application to a discrete dependent variable model requiring an
additional identification assumption that we shall discuss below in Section 4.

The dependence structure of the observed process $Z_k$ is clearly induced by that of $Z_k^*$
and the definitions of the $A_j$'s, since:

^7 Moreover, as long as (2.1) is correctly specified, then increasing price resolution will not affect the estimated $\beta$'s asymptotically.
Of course, finite sample properties may differ.

^8 Conditional on the $X_k$'s and other economic variables $W_k$ influencing the conditional variance $\sigma_k^2$. Unless explicitly stated
otherwise, all the probabilities we deal with in this study are conditional probabilities, and all inferences and statements
concerning these probabilities are conditional, conditioned on these variables.
$$
P(Z_k = s_j \mid Z_{k-1} = s_i) = P(Z_k^* \in A_j \mid Z_{k-1}^* \in A_i). \qquad (2.7)
$$


As a consequence, if the variables $X_k$ and $W_k$ are temporally independent, the observed
process $Z_k$ is also temporally independent. Of course, these are fairly restrictive assumptions
and are certainly not necessary for any of the statistical inferences that follow. We
require only that the $\epsilon_k$'s be conditionally independent, so that all serial dependence is
captured by the $X_k$'s and the $W_k$'s. Consequently, the independence of the $\epsilon_k$'s does not
imply that the $Z_k^*$'s are independently distributed because we have placed no restrictions
on the temporal dependence of the $X_k$'s or $W_k$'s.

The conditional distribution of observed price changes $Z_k$, conditioned on $X_k$ and
$W_k$, is determined by the partition boundaries and the particular distribution of $\epsilon_k$. For
Gaussian $\epsilon_k$'s, the conditional distribution is:


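$$
P(Z_k = s_i \mid X_k, W_k) = P(X_k'\beta + \epsilon_k \in A_i \mid X_k, W_k) \qquad (2.8)
$$

$$
= \begin{cases}
P(X_k'\beta + \epsilon_k \le \alpha_1 \mid X_k, W_k) & \text{if } i = 1 \\
P(\alpha_{i-1} < X_k'\beta + \epsilon_k \le \alpha_i \mid X_k, W_k) & \text{if } 1 < i < m \\
P(\alpha_{m-1} < X_k'\beta + \epsilon_k \mid X_k, W_k) & \text{if } i = m
\end{cases}
\qquad (2.9)
$$

$$
= \begin{cases}
\Phi\left(\dfrac{\alpha_1 - X_k'\beta}{\sigma_k}\right) & \text{if } i = 1 \\
\Phi\left(\dfrac{\alpha_i - X_k'\beta}{\sigma_k}\right) - \Phi\left(\dfrac{\alpha_{i-1} - X_k'\beta}{\sigma_k}\right) & \text{if } 1 < i < m \\
1 - \Phi\left(\dfrac{\alpha_{m-1} - X_k'\beta}{\sigma_k}\right) & \text{if } i = m
\end{cases}
\qquad (2.10)
$$

where $\Phi(\cdot)$ is the standard normal cumulative distribution function.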
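
The state probabilities in (2.10) are simple to compute numerically. The following is a minimal sketch, not the authors' code: the names `norm_cdf` and `state_probabilities` are hypothetical, and the conditional mean and standard deviation are passed in as plain numbers rather than built from regressors.

```python
from math import erf, inf, sqrt


def norm_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def state_probabilities(alphas, mean, sigma):
    """Probabilities of states s_1..s_m as in (2.10).

    alphas: increasing partition boundaries alpha_1..alpha_{m-1}
    mean:   conditional mean X_k' beta of the latent variable
    sigma:  conditional standard deviation sigma_k (must be > 0)
    """
    cuts = [-inf] + list(alphas) + [inf]
    cdf = [norm_cdf((c - mean) / sigma) for c in cuts]
    # Each state's probability is the normal mass between adjacent cuts.
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


# Three states with boundaries at -0.5 and +0.5 (illustrative values only).
p_centered = state_probabilities([-0.5, 0.5], mean=0.0, sigma=1.0)
p_shifted = state_probabilities([-0.5, 0.5], mean=1.0, sigma=1.0)
```

Comparing `p_centered` with `p_shifted` shows the mechanism at work: raising the conditional mean moves probability mass from the lowest state toward the highest one while the boundaries stay fixed.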
To develop some intuition for the ordered probit model, observe that the probability
of any particular observed price change is determined by where the conditional mean lies
relative to the partition boundaries. Therefore, for a given conditional mean $X_k'\beta$, shifting
the boundaries will alter the probabilities of observing each state [see Figure 1]. In fact,
by shifting the boundaries appropriately, ordered probit can fit any arbitrary multinomial
distribution. This implies that the assumption of normality underlying ordered probit
plays  no  special  role  in  determining  the  probabilities  of  states  -  a  logistic  distribution,  for 
example,  could  have  served  equally  well. 
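
To make the multinomial-fitting claim concrete, the sketch below inverts the normal CDF to find boundaries that reproduce a given set of state probabilities for a fixed conditional mean and standard deviation. It is an illustration under assumptions: the function names are hypothetical, the target probabilities are made up, and every target probability is assumed to be strictly positive.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def boundaries_for(probs, mean=0.0, sigma=1.0):
    """Boundaries alpha_1..alpha_{m-1} that reproduce `probs` exactly.

    Sets alpha_i = mean + sigma * Phi^{-1}(p_1 + ... + p_i).
    Assumes every entry of `probs` is strictly positive and they sum to 1.
    """
    alphas, cumulative = [], 0.0
    for p in probs[:-1]:
        cumulative += p
        alphas.append(mean + sigma * _STD.inv_cdf(cumulative))
    return alphas


def implied_probs(alphas, mean=0.0, sigma=1.0):
    """State probabilities implied by the boundaries, as in (2.10)."""
    cdf = [0.0] + [_STD.cdf((a - mean) / sigma) for a in alphas] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


target = [0.05, 0.20, 0.50, 0.20, 0.05]  # illustrative multinomial
alphas = boundaries_for(target, mean=0.3, sigma=2.0)
recovered = implied_probs(alphas, mean=0.3, sigma=2.0)
```

The same construction works with any continuous, strictly increasing CDF in place of the normal, which is the sense in which normality plays no special role here.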

Given the partition boundaries, a higher conditional mean $X_k'\beta$ implies a higher probability
of observing a more extreme positive state. Of course, the labelling of states is
arbitrary, but the ordered probit model makes use of the natural ordering of the states.
The regressors allow us to separate the effects of various economic factors that influence
the likelihood of one state versus another. For example, suppose that a large positive
value of $X_1$ usually implies a large negative observed price change and vice versa. Then
the ordered probit coefficient $\beta_1$ will be negative in sign and large in magnitude [relative
to $\sigma_k$ of course].

By allowing the data to determine the partition boundaries $\alpha$, the coefficients $\beta$ of the
conditional mean, and the conditional variance $\sigma_k^2$, the ordered probit model captures the
empirical relation between the unobservable continuous state space $S^*$ and the observed
discrete state space $S$ as a function of the economic variables $X_k$ and $W_k$.

2.1.  Other  Models  of  Discreteness. 

From  these  observations,  it  is  apparent  that  the  rounding/eighths-barriers  models  of 
discreteness  in  Ball  (1988),  Cho  and  Frees  (1988),  Gottlieb  and  Kalay  (1985),  and  Harris 
(1991)  may  be  re-parameterized  as  ordered  probit  models.  Consider  first  the  case  of  a 
"true"  price  process  that  is  an  arithmetic  Brownian  motion,  with  trades  occurring  only 
when  this  continuous-state  process  crosses  an  eighths  threshold  [see  Cho  and  Frees  (1988)]. 
Observed  trades  from  such  a  process  may  be  generated  by  an  ordered  probit  model  in  which 
the partition boundaries are fixed at multiples of eighths and the single regressor is the
time interval [or first-passage time] between crossings, appearing in both the conditional
mean and variance of $Z_k^*$.

For  the  rounding  models  of  Ball  (1988),  Gottlieb  and  Kalay  (1985),  and  Harris  (1991) 


^9 However, it is considerably more difficult to capture conditional heteroskedasticity in the ordered logit model.
which do not make use of waiting times between trades, define the partition boundaries
as the midpoint between eighths, e.g., the observed price change is $\frac{3}{8}$ if the virtual price
process lies in the interval $[\frac{5}{16}, \frac{7}{16})$, and omit the waiting time as a regressor in both the
conditional mean and variance [see the discussion in Section 6.3 below].

The  generality  of  the  ordered  probit  model  comes  from  the  fact  that  the  rounding  and 
eighths-barrier  models  of  discreteness  can  both  be  incorporated  by  appropriate  definitions 
of the partition boundaries. In fact, since the boundaries may be parameterized to be time-
and state-dependent, ordered probit allows for more general kinds of rounding and eighths
barriers. In addition to fitting any arbitrary multinomial distribution, ordered probit may
also  accommodate  finite-state  Markov  chains  and  compound  Poisson  processes. 

Of  course,  other  models  of  discreteness  are  not  necessarily  obsolete,  since  in  several 
cases the parameters of interest may not be simple functions of the ordered probit parameters.
For example, a tedious calculation will show that although Harris's (1991) rounding
model may be represented as an ordered probit model, the bid/ask spread parameter c is
not easily recoverable from the ordered probit parameters. In such cases, other equivalent
specifications may allow more direct estimation of the relevant parameters.

2.2.  The  Likelihood  Function. 

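Let $Y_{ik}$ be an indicator variable which takes on the value 1 if the realization of the k-th
observation $Z_k$ is the i-th state $s_i$, and zero otherwise. Then the log-likelihood function
$\mathcal{L}$ for the vector of price changes $Z = [Z_1 \; Z_2 \; \cdots \; Z_n]'$, conditional on the explanatory
variables $X = [X_1 \; X_2 \; \cdots \; X_n]'$, is given by:

$$
\mathcal{L}(Z \mid X) = \sum_{k=1}^{n} \Bigg\{
Y_{1k} \cdot \log \Phi\left(\frac{\alpha_1 - X_k'\beta}{\sigma_k}\right)
+ \sum_{i=2}^{m-1} Y_{ik} \cdot \log\left[ \Phi\left(\frac{\alpha_i - X_k'\beta}{\sigma_k}\right) - \Phi\left(\frac{\alpha_{i-1} - X_k'\beta}{\sigma_k}\right) \right]
+ Y_{mk} \cdot \log\left[ 1 - \Phi\left(\frac{\alpha_{m-1} - X_k'\beta}{\sigma_k}\right) \right]
\Bigg\}. \qquad (2.11)
$$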

Recall that $\sigma_k^2$ is a conditional variance, conditioned upon $X_k$. This allows for conditional
heteroskedasticity in the $Z_k^*$'s, as in the rounding model of Cho and Frees (1988) where the
$Z_k^*$'s are increments of arithmetic Brownian motion with variance proportional to $t_k - t_{k-1}$.
This special case may be accommodated by the specification:

$$
X_k'\beta = \mu \, \Delta t_k \qquad (2.12)
$$

$$
\sigma_k^2 = \gamma^2 \, \Delta t_k. \qquad (2.13)
$$

More generally, we may also let $\sigma_k^2$ depend on other economic variables $W_k$ so that:

$$
\sigma_k^2 = \gamma_0^2 + \sum_{i=1} \gamma_i^2 W_{ik}. \qquad (2.14)
$$

There are, however, some constraints that must be placed on these parameters to achieve
identification since, for example, doubling the $\alpha$'s, the $\beta$'s, and $\sigma_k$ leaves the likelihood
unchanged. We shall return to this issue in Section 4.
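
The identification problem can be checked numerically. The sketch below evaluates the log-likelihood (2.11) with the linear variance specification (2.14) and then rescales all parameters by a factor of two. It is an illustration only: the function name `log_likelihood`, the argument layout, and the tiny data set are all hypothetical, and the variance regressors are assumed to be non-negative so that the variance stays positive.

```python
from math import erf, inf, log, sqrt


def norm_cdf(x):
    """Standard normal CDF; math.erf handles +/- infinity."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def log_likelihood(states, X, W, alphas, beta, gamma0, gammas):
    """Ordered probit log-likelihood (2.11) with variance (2.14).

    states: observed state numbers in 1..m, one per trade
    X, W:   rows of mean and variance regressors, one row per trade
    """
    cuts = [-inf] + list(alphas) + [inf]
    total = 0.0
    for s, x, w in zip(states, X, W):
        mean = sum(b * xi for b, xi in zip(beta, x))
        var = gamma0 ** 2 + sum(g ** 2 * wi for g, wi in zip(gammas, w))
        sigma = sqrt(var)
        upper = norm_cdf((cuts[s] - mean) / sigma)
        lower = norm_cdf((cuts[s - 1] - mean) / sigma)
        total += log(upper - lower)
    return total


# Made-up data: three states, two mean regressors, one variance regressor.
states = [2, 3, 1, 2, 2, 3]
X = [[1.0, 0.2], [1.0, 1.5], [1.0, -1.1], [1.0, 0.0], [1.0, 0.4], [1.0, 0.9]]
W = [[5.0], [12.0], [3.0], [30.0], [8.0], [1.0]]
alphas, beta, gamma0, gammas = [-0.5, 0.5], [0.1, 0.6], 0.8, [0.05]

ll_base = log_likelihood(states, X, W, alphas, beta, gamma0, gammas)
# Doubling every gamma doubles sigma_k; doubling alphas and beta offsets it.
ll_scaled = log_likelihood(states, X, W, [2 * a for a in alphas],
                           [2 * b for b in beta], 2 * gamma0,
                           [2 * g for g in gammas])
```

Because `ll_base` and `ll_scaled` coincide, a maximizer cannot distinguish the two parameter sets, so some normalization of scale has to be imposed before estimation.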


3.  The  Data. 

The Institute for the Study of Securities Markets [ISSM] transaction database consists
of time-stamped trades [to the nearest second], trade size, and bid/ask quotes from the
New York and American Stock Exchanges and the consolidated regional exchanges from
January 4 to December 30 of 1988. Because of the sheer size of the ISSM database, most
empirical studies have concentrated on more manageable subsets of the database and we
do the same. But because there is so much data, the "pre-test" or "data-snooping" biases
associated with any non-random selection procedure used to obtain the smaller subsets
are likely to be substantial: searching for the largest t-statistic in 1,000 regressions will
yield a more significant [but spurious] finding than searching among only 100 regressions.
Therefore, how we choose our subsample of stocks may have important consequences for
how  our  results  are  to  be  interpreted,  so  we  shall  describe  our  procedure  in  some  detail 
here. 
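
The point about searching over many regressions can be illustrated with a small simulation. This is a rough sketch under a strong simplifying assumption: under the null of no effect, each regression's t-statistic is treated as an independent standard normal draw. The function name `mean_max_abs_t` and the trial counts are hypothetical.

```python
import random


def mean_max_abs_t(n_regressions, n_trials=200, seed=0):
    """Average of the largest |t| found when searching n_regressions nulls.

    Each t-statistic is simulated as an independent N(0, 1) draw, which is
    only a large-sample approximation to a true null t-statistic.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        total += max(abs(rng.gauss(0.0, 1.0)) for _ in range(n_regressions))
    return total / n_trials


best_of_100 = mean_max_abs_t(100)
best_of_1000 = mean_max_abs_t(1000)
```

Even though no regression has any true effect, the best statistic out of 1,000 searches is typically larger than the best out of 100, and both tend to exceed the conventional 1.96 threshold.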


^10 As a simple example of such a bias, suppose we chose our subset by selecting only those stocks that have a minimum
of 100,000 transactions during 1988. This imparts a downward bias on our measures of price impact, since stocks with over
100,000 trades per year are generally more liquid and, almost by definition, have smaller price impact.
We first began with an initial "test" sample of 5 stocks that did not engage in any
stock  splits  or  stock  dividends  greater  than  3:2  during  1988:  Alcoa,  Allied  Signal,  Boeing, 
Dupont,  and  General  Motors.  We  restrict  splits  because  the  effects  of  price  discreteness 
to  be  captured  by  our  model  are  likely  to  change  in  important  ways  with  dramatic  shifts 
in  the  price  level.  By  eliminating  large  splits,  we  reduce  the  problem  of  large  changes  in 
the price level without screening on prices directly.^11 We also chose these 5 stocks because
they are relatively large and visible companies, each with a large number of trades and
therefore likely to yield accurate parameter estimates. We then performed the standard
"specification  searches"  on  these  5  stocks,  adding,  deleting,  and  transforming  regressors 
to  obtain  a  "reasonable"  fit.  By  "reasonable"  we  mean  primarily  the  convergence  of  the 
maximum  likelihood  estimation  procedure,  but  it  must  also  include  Leamer's  (1978)  kind 
of  informal  or  ad  hoc  inferences  that  all  empiricists  engage  in;  the  choices  of  specification 
that  might  have  been  affected  by  such  ad  hoc  inferences  and,  consequently,  their  potential 
biases  will  be  discussed  in  Section  4. 

Once we obtained a specification that was "reasonable," we estimated this specification
without further revision for our primary sample of 11 new stocks, chosen to yield a
representative sample with respect to industries, market value, price levels, and sample
sizes. They are: International Business Machines Corporation (IBM), Abitibi-Price Incorporated
(ABY), Quantum Chemical Corporation (CUE), Dow Chemical Corporation
(DOW), First Chicago Corporation (FNB), Foster Wheeler Corporation (FWC), Handy
and Harman Company (HNH), Navistar International Corporation (NAV), Reebok International
Limited (RBK), Sears Roebuck and Company (S), and American Telephone
and Telegraph Incorporated (T). By using the same specification with stocks in this fresh
sample,  we  sought  to  lessen  the  impact  of  any  data-snooping  biases  generated  by  our 
specification  searches  in  the  test  sample.  If,  for  example,  our  parameter  estimates  and 
subsequent  inferences  changed  dramatically  in  the  new  sample  -  in  fact,  they  did  not  - 
this  might  be  a  sign  that  our  test-sample  findings  were  driven  primarily  by  selection  biases. 

As a final check on the robustness of our specification, we estimate it for a larger
sample of 100 stocks chosen randomly, and these companies are listed in Table 6. From
this  sample,  it  was  apparent  that  our  smaller  11-stock  sample  did  suffer  from  at  least  one 
selection  bias:  it  was  comprised  of  relatively  well-known  companies.  In  contrast,  very  few 
companies in Table 6 were familiar to us. Despite this bias, virtually all of our empirical


^11 Of course, if one were interested in explaining stock splits, this procedure would obviously impart important biases in the
empirical results.
findings were confirmed by this larger sample. To conserve space and to focus attention
on  our  findings,  we  report  the  complete  set  of  summary  statistics  and  estimation  results 
only  for  the  smaller  sample  of  11  stocks,  and  present  broader  and  less  detailed  findings  for 
the  extended  sample  afterwards. 

Of  course,  as  long  as  there  is  cross-sectional  dependence  among  the  two  samples 
it  is  impossible  to  eliminate  such  biases  completely.  Moreover,  samples  drawn  from  a 
different  time  period  are  not  necessarily  free  from  selection  bias  as  some  have  suggested, 
due  to  the  presence  of  temporal  dependence.  Unfortunately,  non-experimental  inference  is 
always  subject  to  selection  biases  of  one  kind  or  another  since  specification  searches  are  an 
unavoidable  aspect  of  genuine  progress  in  empirical  research.  Even  Bayesian  inference, 
which  is  not  as  sensitive  to  the  kinds  of  selection  biases  discussed  in  Leamer  (1978),  can 
be  distorted  in  subtle  ways  by  specification  searches.  Therefore,  beyond  our  test-sample 
procedure,  we  can  only  alert  readers  to  the  possibility  of  such  biases  and  allow  them  to 
draw  their  own  inferences. 


3.1.  Sample  Statistics. 

We  take  as  our  basic  time  series  the  intra-day  price  changes  from  trade  to  trade,  i.e., 
all overnight price changes are discarded. We do this because we wish to capture the
behavior of the intra-day price process, and overnight price changes are different enough
to warrant a separate specification.^13 For similar reasons, the first and last transaction
prices of each day were also discarded - they differ systematically from other prices due
to institutional features [see Amihud and Mendelson (1987) for further details]. Several
other screens were imposed to eliminate "problem" trades and quotes, yielding sample
sizes ranging from 1,515 trades for ABY to 206,794 trades for IBM.^14

Since we also use bid and ask prices in our analysis, some discussion of how we matched


^12 See, for example, Lo and MacKinlay (1990b).

^13 That the statistical properties of overnight price changes differ considerably from those of intra-day price changes has been
convincingly documented by several authors, most recently by Amihud and Mendelson (1987), Stoll and Whaley (1990), and
Wood et al. (1985).

^14 Specifically: (1) All trades flagged with the following ISSM condition codes were eliminated: A, C, D, O, R, and Z [see
the ISSM documentation for further details concerning trade condition codes]. (2) Also eliminated were transactions exceeding
3,276,000 shares [termed "big trades" by ISSM]. (3) Because we use three lags of price changes and three lags of 5-minute returns
on the S&P 500 index futures prices as explanatory variables, we do not use the first three price changes or price changes during
the first 15 minutes of each day [whichever is greater] as observations of the dependent variable. (4) Since S&P 500 futures data
were not available on November 10, 11, and the first 2 trading hours of May 3, trades during these times were also omitted.

Note that for some stocks, a small number of transactions occurred at prices denominated in 1/16's, 1/32's or 1/64's of a
dollar [non-NYSE trades]. In these cases, we rounded the price randomly [up or down] to the nearest 1/8, and if necessary, also
rounded the bid/ask quotes in the same direction.
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quotes  to  prices  is  required. ^^  Since  bid/ask  quotes  are  reported  on  the  ISSM  tape  only 
when  they  are  revised,  it  is  natural  to  match  each  transaction  price  to  the  most  recently 
reported  quote  prior  to  the  transaction.  However,  Lee  and  Ready  (1991)  and  others  have 
shown  that  prices  of  trades  which  precipitate  quote  revisions  are  sometimes  reported  with 
a  lag,  so  that  the  order  of  quote  revision  and  transsiction  price  is  reversed  in  official  records 
such  as  the  ISSM  tapes.  To  address  this  issue,  we  match  transaction  prices  to  quotes  that 
are  set  at  least  5  seconds  prior  to  the  transaction;  the  evidence  in  Lee  and  Ready  (1991) 
suggests  that  this  will  account  for  most  of  the  mis-sequencing. 
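
The 5-second matching rule can be illustrated with a minimal Python sketch. The function name `match_quotes`, the representation of time stamps as seconds, and the sample quotes are assumptions made purely for illustration; they are not taken from the ISSM tapes.

```python
from bisect import bisect_right

def match_quotes(trade_times, quote_times, quotes, min_lag=5):
    """For each trade time (in seconds), return the most recent quote that was
    set at least `min_lag` seconds earlier, or None if no such quote exists.
    `quote_times` must be sorted in ascending order."""
    matched = []
    for t in trade_times:
        # Number of quotes with time stamp <= t - min_lag.
        i = bisect_right(quote_times, t - min_lag)
        matched.append(quotes[i - 1] if i > 0 else None)
    return matched

# Hypothetical quote revisions: (bid, ask) in dollars, stamped in seconds.
quote_times = [100, 130, 158]
quotes = [(20.000, 20.125), (20.125, 20.250), (20.250, 20.375)]

# The trade at 160 ignores the quote set at 158 (only 2 seconds earlier).
print(match_quotes([104, 160, 163], quote_times, quotes))
```

A trade with no quote at least 5 seconds old is left unmatched in this sketch; how such trades should be treated in practice is a separate choice.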

To provide some intuition for this enormous dataset, we report a few summary statistics
in Table 1. To see that our sample of 11 stocks contains considerable dispersion,
observe that the low stock price ranges from $3.125 for NAV to $104.250 for IBM, whereas
the high ranges from $7.875 for NAV to $129.500 for IBM. At $219 million, HNH has the
smallest market capitalization in our sample, and IBM has the largest with a market value
of $69.8 billion.

For our empirical analysis we also require some indicator of whether a transaction was
buyer-initiated or seller-initiated. Obviously, this is a difficult task since for every trade
there is always a buyer and a seller. What we are attempting to measure is which of the
two parties is more anxious to consummate the trade, and is therefore willing to pay for it
in the form of the bid/ask spread. Perhaps the most obvious indicator is whether the transaction
occurs at the ask price or at the bid price - if it is the former then the transaction is most
likely a "buy," if it is the latter then the transaction is most likely a "sell." Unfortunately,
a large number of transactions occur at prices strictly within the bid/ask spread, so that
such a method for signing trades will leave the majority of them indeterminate.

Following Blume, MacKinlay and Terker (1989) and many others, we classify a transaction
as a buy if the transaction price is higher than the mean of the prevailing bid/ask
quote [the most recent quote that is set at least 5 seconds prior to the trade], and classify
it as a sell if the price is lower. Should the price equal the mean of the prevailing
bid/ask quote, we classify the trade as an "indeterminate" trade. This method classifies
far fewer trades as indeterminate than classifying according to transactions at the bid or
ask.^^ From Table 1 we see that between 13 and 26 percent of each stock's transactions
are indeterminate, and the remaining trades fall almost equally into the two remaining
categories. The two exceptions are the two smallest stocks, ABY and HNH. The former
has almost twice as many buys as sells, whereas the latter has more than twice as many
sells as buys.


" Quotes implying bid/ask spreads greater than 40 ticks or flagged with the following ISSM condition codes were eliminated:
C, D, F, G, I, L, N, P, S, V, X, and Z [essentially all "BBO-ineligible" quotes]. See the ISSM documentation for further details
concerning the definitions of the particular trade and quote condition codes. Eikeboom (1991) has performed a thorough study
of the relative frequencies of these condition codes for a small subset of the ISSM database.

" Unfortunately, little is known about the relative merits of this method of classification versus others such as the "tick
test" [which classifies a transaction as a buy, a sell, or indeterminate if its price is greater than, less than, or equal to the
previous transaction's price, respectively], simply because it is virtually impossible to obtain the data necessary to evaluate

The means and standard deviations of other variables to be used in our ordered probit
analysis are also given in Table 1. The precise definitions of these variables will be given
below in Section 4, but briefly, $Z_k$ is the price change between transactions $k-1$ and
$k$, $\Delta t_k$ is the time elapsed between these trades, $\mathrm{AB}_k$ is the bid/ask spread prevailing at
transaction $k$, $\mathrm{SP500}_k$ is the return on the S&P 500 index futures price over the five-minute
period immediately preceding transaction $k$, $\mathrm{IBS}_k$ is the buy/sell indicator described above
[1 for a buy, -1 for a sell, and 0 for an indeterminate trade], and $T_\lambda(V_k)$ is a transformation
of the dollar volume of transaction $k$, transformed according to the Box and Cox (1964)
specification with parameter $\lambda_i$ which is estimated for each stock $i$ by maximum likelihood
along with the other ordered probit parameters.

From Table 1 we see that for the larger stocks, trades tend to occur almost every
minute on average, with the exception of FNB which has an average $\Delta t_k$ of about five
minutes. Of course, the smaller stocks trade less frequently, with ABY trading only once
every thirty minutes on average. The median dollar volume per trade also varies considerably,
ranging from $3,000 for the relatively low-priced NAV to $57,400 for the higher-priced
DOW.

Finally, Figure 2 contains histograms for the price change, time-between-trade, and
dollar volume variables for the 11 stocks. The histograms of price changes are constructed
so that the most extreme cells also include observations beyond them, i.e., the level of the
histogram for the -4 tick cell reflects all price changes of -4 ticks or less, and similarly
for the +4 ticks cell. Surprisingly, these price histograms are remarkably symmetric across
all stocks. Also, virtually all the mass in each histogram is concentrated in five or seven
cells - there are few absolute price changes of 4 ticks or more, further emphasizing the
importance of discreteness in transaction prices.
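
The construction of the extreme histogram cells can be sketched in a few lines of Python. The function names and the sample price changes below are hypothetical and serve only to show how changes beyond 4 ticks are folded into the outermost cells.

```python
from collections import Counter

def clamp_ticks(dz, bound=4):
    """Fold a price change (in ticks) into the range [-bound, +bound]."""
    return max(-bound, min(bound, dz))

def tick_histogram(price_changes, bound=4):
    """Count price changes per cell; the two extreme cells absorb everything
    at or beyond +/- bound ticks."""
    counts = Counter(clamp_ticks(dz, bound) for dz in price_changes)
    return {cell: counts.get(cell, 0) for cell in range(-bound, bound + 1)}

# Hypothetical price changes in ticks (eighths of a dollar).
changes = [0, 1, -1, 0, 6, -7, 2, 0]
hist = tick_histogram(changes)
print(hist[-4], hist[0], hist[4])   # 1 3 1
```

The same clamping is what makes a 9-state (or 5-state) definition of the dependent variable well populated in every cell.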

For the time-between-trades and dollar volume variables, the largest cell, i.e., 1,500
seconds or $200,000, includes all trades beyond it. As expected, the histograms for these
quantities vary greatly according to market value and price level. For the larger stocks,
the time between trades is relatively short, hence most of the mass in those histograms is
in the lower-valued cells. But the histograms of smaller, less liquid stocks like ABY and
HNH have spikes in the largest-valued cell. Histograms for dollar volume are sometimes
bi-modal, as in the case of IBM, reflecting both round-lot trading at 100 shares [$10,000
on average for IBM's stock price during 1988] and some very large trades, presumably by
institutional investors.


these alternatives. The only study we have seen is by Robinson (1988, Chapter 4.4.1, Table 19), in which he compared the tick
test rule to the bid/ask mean rule for a sample of 196 block trades initiated by two major Canadian life insurance companies, and
concluded that the bid/ask mean rule was considerably more accurate. Therefore, we adopt this method of signing transactions.


4.  The  Empirical  Specification. 

To estimate the parameters of the ordered probit model via maximum likelihood, we
must first specify: (i) the number of states $m$; (ii) the explanatory variables $X_k$; and (iii)
the parametrization of the variance $\sigma_k^2$.

In choosing $m$, we must balance price resolution against the practical constraint that
an $m$ too large will yield no observations in the extreme states $s_1$ and $s_m$. For example,
if we set $m$ to 101 and define the states $s_1$ and $s_{101}$ symmetrically to be price changes of
-50 ticks and +50 ticks respectively, we would find no $Z_k$'s among our 11 stocks falling
into these two states. Using the histograms in Figure 2 as a guide, we set $m = 9$ for the
larger stocks, implying extreme states of -4 ticks or less and +4 ticks or more. For the
three smaller stocks, ABY, FWC and HNH, we set $m = 5$, implying extreme states of -2
ticks or less and +2 ticks or more.^

'^The definition of states need not be symmetric - state $s_1$ can be -6 ticks or less, implying that state $s_9$ is +2 ticks or
more. However, the symmetry of the histogram of price changes in Figure 2 suggests a symmetric definition of the $s_j$'s.

In selecting the explanatory variables $X_k$, we seek to capture several aspects of transaction
price changes. First, we would like to allow for clock-time effects, since there is
currently some dispute over whether trade-to-trade prices are stable in transaction time
versus clock time. Second, we would like to account for the effects of the bid/ask spread
on price changes since many transactions are merely movements from the bid price to
the ask price or vice versa. If, for example, in a sequence of three trades the first and
third were buyer-initiated while the second was seller-initiated, the sequence of transaction
prices would exhibit reversals due solely to the bid/ask "bounce." Third, we would
like to measure how the conditional distribution of price changes shifts in response to a
trade of a given volume, i.e., the price impact per unit volume of trade. And fourth, we
would like to capture the effects of "systematic" or market-wide movements in prices on
the conditional distribution of an individual stock's price changes. To address these four
issues, we first construct the following variables:

$\Delta t_k$: The time elapsed between transactions $k-1$ and $k$, in seconds.

$\mathrm{AB}_{k-1}$: The bid/ask spread prevailing at time $t_{k-1}$, in ticks.

$Z_{k-l}$: Three lags [$l$ = 1, 2, 3] of the dependent variable $Z_k$. Recall that for
$m = 9$, price changes less than -4 ticks are set equal to -4 ticks [state
$s_1$], and price changes greater than +4 ticks are set equal to +4 ticks
[state $s_9$], and similarly for $m = 5$.

$V_{k-l}$: Three lags [$l$ = 1, 2, 3] of the dollar volume of the $(k-l)$-th transaction,
defined as the price of the $(k-l)$-th transaction [in dollars, not ticks] times
the number of shares traded [denominated in 100's of shares], hence dollar
volume is denominated in $100's of dollars. To reduce the influence of
outliers, if the share volume of a trade exceeds the 99.5 percentile of the
empirical distribution of share volume for that stock, we set it equal to
the 99.5 percentile.^^

'* For example, the 99.5 percentile for IBM's share volume is 16,500 shares, hence all IBM trades exceeding 16,500 shares are
set equal to 16,500 shares. By definition, only one half of one percent of the 206,794 IBM trades [or 1,034 trades] were "censored"
in this manner. We chose not to discard these trades because omitting them could affect our estimates of the lag structure,
which is extremely sensitive to the sequence of trades. For the 10 remaining stocks, the 99.5 percentiles for share volume are:
ABY=2S,600, CUE=21,300, DOW=2S,100, FNB=46,200, FWC=31,700, HNH=20,000, NAV=60,000, RBK=25,000, S=S0,000,
and T=44,100.

$\mathrm{SP500}_{k-l}$: Three lags [$l$ = 1, 2, 3] of 5-minute continuously compounded returns
of the Standard and Poor's 500 index futures price, for the contract
maturing in the closest month beyond the month in which transaction $k-l$
occurred, where the return is computed with the futures price recorded
one minute before the nearest round minute prior to $t_{k-l}$ and the price
recorded five minutes before this. More formally, we have:

$$\mathrm{SP500}_{k-1} = \log\left[ \frac{F(t_{k-1}^- - 60)}{F(t_{k-1}^- - 360)} \right] \qquad (4.1)$$

$$\mathrm{SP500}_{k-2} = \log\left[ \frac{F(t_{k-1}^- - 360)}{F(t_{k-1}^- - 660)} \right] \qquad (4.2)$$

$$\mathrm{SP500}_{k-3} = \log\left[ \frac{F(t_{k-1}^- - 660)}{F(t_{k-1}^- - 960)} \right] \qquad (4.3)$$

where $F(t^-)$ is the S&P 500 index futures price at time $t^-$ [measured in
seconds] for the contract maturing the closest month beyond the month
of transaction $k-l$, and $t^-$ is the nearest round minute prior to time $t$
[for example, if $t$ is 10:35:47, then $t^-$ is 10:35:00].^®

$\mathrm{IBS}_{k-l}$: Three lags [$l$ = 1, 2, 3] of an indicator variable that takes the value 1 if
the $(k-l)$-th transaction price is greater than the average of the quoted
bid and ask prices at time $t_{k-l}$, the value -1 if the $(k-l)$-th transaction
price is less than the average of the bid and ask prices at time $t_{k-l}$, and
0 otherwise, i.e.,

$$\mathrm{IBS}_{k-1} = \begin{cases} 1 & \text{if } P_{k-1} > \tfrac{1}{2}(P_{k-1}^a + P_{k-1}^b) \\ 0 & \text{if } P_{k-1} = \tfrac{1}{2}(P_{k-1}^a + P_{k-1}^b) \\ -1 & \text{if } P_{k-1} < \tfrac{1}{2}(P_{k-1}^a + P_{k-1}^b) \end{cases} \qquad (4.4)$$

Whether the $(k-l)$-th transaction price is closer to the ask price or the
bid price is one measure of whether the transaction was buyer-initiated
[$\mathrm{IBS}_{k-l} = 1$] or seller-initiated [$\mathrm{IBS}_{k-l} = -1$]. If the transaction price is
at the midpoint of the bid and ask prices, the indicator is indeterminate
[$\mathrm{IBS}_{k-l} = 0$].
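
The buy/sell indicator in (4.4) amounts to comparing the transaction price with the quote midpoint. The following Python sketch is illustrative only; the function name `ibs` and the sample prices are assumptions. Because prices and quotes are multiples of 1/8 of a dollar, they are exactly representable in binary floating point, so the equality test at the midpoint is safe in this setting.

```python
def ibs(price, bid, ask):
    """Return 1 for a buy, -1 for a sell, 0 for an indeterminate trade,
    by comparing the price with the midpoint of the prevailing quote."""
    mid = 0.5 * (bid + ask)
    if price > mid:
        return 1
    if price < mid:
        return -1
    return 0

# Hypothetical quote of 20 bid, 20 1/4 ask.
print(ibs(20.250, 20.0, 20.25))   # 1  (at the ask)
print(ibs(20.125, 20.0, 20.25))   # 0  (at the midpoint)
print(ibs(20.000, 20.0, 20.25))   # -1 (at the bid)
```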


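Our specification of $X_k'\beta$ is then given by the following expression:

$$\begin{aligned} X_k'\beta = {} & \beta_1 \Delta t_k + \beta_2 Z_{k-1} + \beta_3 Z_{k-2} + \beta_4 Z_{k-3} + \beta_5 \mathrm{SP500}_{k-1} + \beta_6 \mathrm{SP500}_{k-2} + {} \\ & \beta_7 \mathrm{SP500}_{k-3} + \beta_8 \mathrm{IBS}_{k-1} + \beta_9 \mathrm{IBS}_{k-2} + \beta_{10} \mathrm{IBS}_{k-3} + {} \\ & \beta_{11} \{ T_\lambda(V_{k-1}) \cdot \mathrm{IBS}_{k-1} \} + \beta_{12} \{ T_\lambda(V_{k-2}) \cdot \mathrm{IBS}_{k-2} \} + {} \\ & \beta_{13} \{ T_\lambda(V_{k-3}) \cdot \mathrm{IBS}_{k-3} \} . \end{aligned} \qquad (4.5)$$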

''This rather convoluted timing for computing $\mathrm{SP500}_{k-l}$ ensures that there is no temporal overlap between price changes
and the returns to the index futures price. In particular, we first construct a minute-by-minute time series for futures prices by
assigning to each round minute the nearest futures transaction price occurring after that minute but before the next [hence if
the first futures transaction after 10:35:00 occurs at 10:35:16, the futures price assigned to 10:35:00 is this one]. If no transaction
occurs during this minute, the price prevailing at the previous minute is assigned to the current minute. Then for the price
change $Z_k$, we compute $\mathrm{SP500}_{k-1}$ using the futures price one minute before the nearest round minute prior to $t_{k-1}$, and the
price five minutes before this [hence if $t_{k-1}$ is 10:36:45, we use the futures price assigned to 10:35:00 and 10:30:00 to compute
$\mathrm{SP500}_{k-1}$].

The variable $\Delta t_k$ is included in $X_k$ to allow for clock-time effects on the conditional mean
of $Z_k^*$. If prices are stable in "transaction" time rather than clock time, this coefficient
should be zero. Lagged price changes are included to account for serial dependencies, and
lagged returns of the S&P 500 index futures price are included to account for market-wide
effects on price changes.
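
The timing convention for the three lagged S&P 500 futures returns can be made concrete with a short Python sketch. It assumes that a minute-by-minute futures price series has already been constructed and is stored in a dictionary keyed by round-minute time stamps in seconds; the names `hms` and `sp500_lags`, the flooring of the trade time to its round minute, and the price series itself are hypothetical.

```python
import math

def hms(h, m, s):
    """Clock time to seconds since midnight."""
    return 3600 * h + 60 * m + s

def sp500_lags(t_prev, futures):
    """Three consecutive 5-minute log returns, the most recent one ending one
    minute before the round minute of t_prev (time of transaction k-1).
    `futures` maps round-minute time stamps (seconds) to futures prices."""
    t_minus = (t_prev // 60) * 60          # round minute prior to t_prev
    ends = [60, 360, 660, 960]             # offsets in seconds
    return [math.log(futures[t_minus - ends[i]] / futures[t_minus - ends[i + 1]])
            for i in range(3)]

# Hypothetical minute-by-minute futures prices from 10:20:00 to 10:36:00.
futures = {hms(10, 20 + i, 0): 250.0 + 0.05 * i for i in range(17)}

# For a trade at 10:36:45 the first lag uses the 10:35:00 and 10:30:00 prices.
lags = sp500_lags(hms(10, 36, 45), futures)
print(lags)
```

The three returns are adjacent and non-overlapping, so their sum is the 15-minute log return ending at 10:35:00.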

To measure the price impact of a trade per unit volume we include the term $T_\lambda(V_{k-l})$,
dollar volume transformed according to the Box and Cox (1964) specification $T_\lambda(\cdot)$:

$$T_\lambda(x) = \frac{x^\lambda - 1}{\lambda} \qquad (4.6)$$

where $\lambda \in [0, 1]$ is also a parameter to be estimated. The Box-Cox transformation allows
dollar volume to enter into the conditional mean nonlinearly, a particularly important
innovation since common intuition suggests that price impact may exhibit economies of
scale with respect to dollar volume - although total price impact is likely to increase
with volume, the marginal price impact probably does not. The Box-Cox transformation
captures the linear specification [$\lambda = 1$] and concave specifications up to and including the
logarithmic function [$\lambda = 0$]. The estimated curvature of this transformation will play an
important role in the measurement of price impact.
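
As a rough illustration of how the signed volume term might be assembled, the following Python sketch applies the Box-Cox transformation to dollar volume (in $100's, after capping share volume) and multiplies by the buy/sell indicator. The function names, the treatment of $\lambda = 0$ as the logarithmic limit, and the sample trade are assumptions for illustration; in the estimation itself $\lambda$ is a free parameter.

```python
import math

def box_cox(x, lam):
    """Box-Cox transform T_lambda(x) for x > 0; the log limit is used at lam = 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

def signed_volume_term(price, shares, ibs, lam, share_cap):
    """T_lambda(dollar volume in $100's) times the buy/sell indicator.
    Share volume is first capped at share_cap (e.g. a 99.5 percentile)."""
    capped = min(shares, share_cap)
    dollar_volume = price * capped / 100.0   # price times round lots
    return box_cox(dollar_volume, lam) * ibs

print(box_cox(100.0, 1.0))   # 99.0, the linear case
# Hypothetical sell of 20,000 shares at $110, capped at 16,500 shares.
print(signed_volume_term(110.0, 20000, -1, 0.0, 16500))
```

For any $\lambda < 1$ equal increments in dollar volume add successively less to the transformed value, which is the concavity the text associates with declining marginal price impact.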

The transformed dollar volume variable is interacted with $\mathrm{IBS}_{k-l}$, an indicator of
whether the trade was buyer-initiated [$\mathrm{IBS}_k = 1$], seller-initiated [$\mathrm{IBS}_k = -1$], or indeterminate
[$\mathrm{IBS}_k = 0$]. A positive $\beta_{11}$ would imply that buyer-initiated trades tend to push
prices up and seller-initiated trades tend to drive prices down. Such a relation is predicted
by several information-based models of trading, e.g., Easley and O'Hara (1987). Moreover,
the magnitude of $\beta_{11}$ is the per-unit volume impact on the conditional mean of $Z_k^*$, which
may be readily translated into the impact on the conditional probabilities of observed price
changes. The sign and magnitudes of $\beta_{12}$ and $\beta_{13}$ measure the persistence of price impact.

To complete our specification we must parametrize the conditional variance $\sigma_k^2 = \gamma_0^2 +
\sum_i \gamma_i^2 W_{ik}$. To allow for clock-time effects we include $\Delta t_k$, and since there is some evidence
linking bid/ask spreads to the information content and volatility of price changes,^^ we also
include the lagged spread $\mathrm{AB}_{k-1}$. Finally, recall from Section 2.2 that the parameters $\alpha$,
$\beta$, and $\gamma$ are unidentified without additional restrictions, hence we make the identification
assumption that $\gamma_0^2 = 1$. Our variance parametrization is then:

$$\sigma_k^2 = 1 + \gamma_1^2 \Delta t_k + \gamma_2^2 \mathrm{AB}_{k-1} . \qquad (4.7)$$


'"See, for example, Glosten (1987), Hasbrouck (1988, 1991a,b), and Petersen and Umlauf (1990).

In summary, our 9-state specification requires the estimation of 24 parameters: the partition
boundaries $\alpha_1, \ldots, \alpha_8$, the variance parameters $\gamma_1$ and $\gamma_2$, the coefficients of the
explanatory variables $\beta_1, \ldots, \beta_{13}$, and the Box-Cox parameter $\lambda$. The 5-state specification
requires the estimation of only 20 parameters.
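
To show how these pieces fit together, the following Python sketch evaluates the ordered probit log-likelihood for a handful of observations. It assumes the standard ordered probit state probabilities, i.e., that the probability of state $s_j$ is the normal probability that the latent variable falls between adjacent partition boundaries, with conditional standard deviation from (4.7). All names, boundary values, and observations are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def sigma_k(dt, ab, g1, g2):
    """Conditional standard deviation implied by the variance in (4.7)."""
    return math.sqrt(1.0 + g1 ** 2 * dt + g2 ** 2 * ab)

def state_probs(xb, sigma, alphas):
    """Probabilities of the m = len(alphas) + 1 states given X_k'beta = xb."""
    cuts = [-math.inf] + list(alphas) + [math.inf]
    cdf = [norm_cdf((a - xb) / sigma) for a in cuts]
    return [cdf[j + 1] - cdf[j] for j in range(len(alphas) + 1)]

def log_likelihood(obs, alphas, g1, g2):
    """obs holds tuples (state index 0..m-1, X_k'beta, dt_k, AB_{k-1})."""
    total = 0.0
    for state, xb, dt, ab in obs:
        p = state_probs(xb, sigma_k(dt, ab, g1, g2), alphas)[state]
        total += math.log(p)
    return total

alphas = [-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]   # hypothetical, m = 9
n_params = len(alphas) + 2 + 13 + 1    # boundaries, gammas, betas, lambda
print(n_params)                        # 24

probs = state_probs(0.0, sigma_k(30.0, 1.0, 0.1, 0.5), alphas)
obs = [(4, 0.0, 30.0, 1.0), (5, 0.2, 10.0, 2.0), (3, -0.1, 60.0, 1.0)]
print(log_likelihood(obs, alphas, 0.1, 0.5))
```

With five states there are four boundaries instead of eight, which gives the 20 parameters of the smaller specification.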


5.  The  Maximum  Likelihood  Estimates. 

We compute the maximum likelihood [ML] estimators numerically using the algorithm
proposed by Berndt, Hall, Hall and Hausman (1974), hereafter BHHH. The advantage of
BHHH over other search algorithms is its reliance on only first derivatives, an important
computational consideration for sample sizes such as ours.^ The asymptotic covariance
matrix of the parameter estimates was computed as the negative inverse of the matrix of
[numerically determined] second derivatives of the log-likelihood function with respect to
the parameters, evaluated at the maximum likelihood estimates. We used a tolerance of
0.001 for the convergence criterion suggested by BHHH: the product of the gradient and
the direction vector. To check the robustness of our numerical search procedure, we used
several different sets of starting values for each stock, and in all instances our algorithm
converged to virtually identical parameter estimates.
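
The structure of a BHHH iteration can be sketched on a much smaller problem. The Python example below fits a two-parameter binary probit, not the ordered probit model of this paper, and is not the FORTRAN implementation used for the estimates; the data, the function names, and the simple step-halving rule are all assumptions chosen to keep the sketch short. It shows the two features mentioned above: the search direction uses only first derivatives, through the outer product of the per-observation scores, and convergence is judged by the product of the gradient and the direction vector with a tolerance of 0.001.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_one(theta, x):
    """P(y = 1 | x) in a binary probit, clamped away from 0 and 1."""
    p = norm_cdf(theta[0] + theta[1] * x)
    return min(max(p, 1e-12), 1.0 - 1e-12)

def loglik(theta, xs, ys):
    return sum(math.log(prob_one(theta, x)) if y == 1
               else math.log(1.0 - prob_one(theta, x))
               for x, y in zip(xs, ys))

def scores(theta, xs, ys):
    """Per-observation first derivatives of the log-likelihood."""
    out = []
    for x, y in zip(xs, ys):
        p = prob_one(theta, x)
        w = (y - p) * norm_pdf(theta[0] + theta[1] * x) / (p * (1.0 - p))
        out.append((w, w * x))
    return out

def bhhh(xs, ys, theta=(0.0, 0.0), tol=1e-3, max_iter=500):
    theta = list(theta)
    crit = float("inf")
    for _ in range(max_iter):
        s = scores(theta, xs, ys)
        g0 = sum(a for a, _ in s)
        g1 = sum(b for _, b in s)
        # The outer product of the scores stands in for second derivatives.
        b00 = sum(a * a for a, _ in s)
        b01 = sum(a * b for a, b in s)
        b11 = sum(b * b for _, b in s)
        det = b00 * b11 - b01 * b01
        d0 = (b11 * g0 - b01 * g1) / det     # direction solves B d = g
        d1 = (b00 * g1 - b01 * g0) / det
        crit = g0 * d0 + g1 * d1             # gradient times direction
        if crit < tol:
            break
        step, base = 1.0, loglik(theta, xs, ys)
        while step > 1e-10 and loglik(
                [theta[0] + step * d0, theta[1] + step * d1], xs, ys) < base:
            step *= 0.5                      # halve until the likelihood rises
        theta = [theta[0] + step * d0, theta[1] + step * d1]
    return theta, crit

# Hypothetical data with overlapping outcomes, so the ML estimate is finite.
xs = [-2, -1, -1, 0, 0, 0, 1, 1, 2, 2]
ys = [0, 0, 1, 0, 1, 0, 1, 0, 1, 1]
theta_hat, crit = bhhh(xs, ys)
print(theta_hat, crit)
```

Re-running `bhhh` from several starting values, as described in the text, is a simple check that the search has not stopped at a spurious point.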

In Table 2a we report the ML estimates of the ordered probit model for our 11 stocks.
Entries in each of the columns labelled with ticker symbols are the parameter estimates
for that stock, and to the immediate right of each parameter estimate is the corresponding
z-statistic, which is asymptotically distributed as a standard normal variate under the
null hypothesis that the coefficient is 0, i.e., it is the parameter estimate divided by its
asymptotic standard error.

Table 2a shows that the partition boundaries are estimated with high precision for
all stocks. As expected, the z-statistics are much larger for those stocks with many more
observations. The parameters for $\sigma_k^2$ are also statistically significant, hence homoskedasticity
may be rejected at conventional significance levels - larger bid/ask spreads and longer
time intervals increase the conditional volatility of the disturbance.

''All computations were performed in double precision in an ULTRIX environment on a DEC 5000/200 workstation with
16 Mb of memory, using our own FORTRAN implementation of the BHHH algorithm with analytical first derivatives. As a
rough guide to the computational demands of ordered probit, note that the numerical estimation procedure for the stock with
the largest number of trades - IBM [206,794 trades] - required only 2 hours and 45 minutes of cpu time.

The conditional means of the $Z_k^*$'s for all stocks are only marginally affected by $\Delta t_k$.
Moreover, the z-statistics are minuscule, especially in light of the large sample sizes. However,
as mentioned above, $\Delta t_k$ does enter into the $\sigma_k^2$ expression significantly, hence clock-time
is important for the conditional variances, but not for the conditional means of $Z_k^*$.
Note that this does not necessarily imply the same for the conditional distribution of the
$Z_k$'s, which is nonlinearly related to the conditional distribution of the $Z_k^*$'s. For example,
the conditional mean of the $Z_k$'s may well depend on the conditional variance of the $Z_k^*$'s,
so that clock-time can still affect the conditional mean of observed price changes even
though it does not affect the conditional mean of $Z_k^*$.

More striking is the significance and sign of the lagged price change coefficients $\hat\beta_2$,
$\hat\beta_3$, and $\hat\beta_4$ - they are negative for all stocks, implying a tendency towards price reversals.
For example, if the past three price changes were each 1 tick, the conditional mean of $Z_k^*$
changes by $\hat\beta_2 + \hat\beta_3 + \hat\beta_4$. However, if the sequence of price changes was 1/-1/1, then the
effect on the conditional mean is $\hat\beta_2 - \hat\beta_3 + \hat\beta_4$, a quantity closer to zero for each of the
security's parameter estimates.^^

Note that these coefficients measure reversal tendencies beyond that induced by the
presence of a constant bid/ask spread as in Roll (1984). The effect of this "bid/ask bounce"
on the conditional mean should be captured by the indicator variables $\mathrm{IBS}_{k-1}$, $\mathrm{IBS}_{k-2}$,
and $\mathrm{IBS}_{k-3}$. In the absence of all other information [such as market movements, past price
changes, etc.], these variables pick up any price effects that buys and sells might have on the
conditional mean. As expected, the estimated coefficients are generally negative, indicating
the presence of reversals due to movements from bid to ask or ask to bid prices. In Section
6.1 we shall compare their magnitudes explicitly, and conclude that the conditional mean
of price changes is path-dependent with respect to past price changes.

The lagged S&P 500 returns are also significant, but have a more persistent effect
on some securities. For example, the coefficient for the first lag of the S&P 500 is large
and significant for DOW, but the coefficients for the second and third are small and
insignificant. However, for the less actively traded stocks such as CUE, all three coefficients
are significant and are about the same order of magnitude. As a measure of how quickly
market-wide information is impounded into prices, these coefficients confirm the common
intuition that smaller stocks react more slowly than larger stocks, which is consistent with
the lead/lag effects uncovered by Lo and MacKinlay (1990a).


'^In an earlier specification, in place of lagged price changes we included separate indicator variables for eight of the nine
states of each lagged price change. But because the coefficients of the indicator variables increased monotonically from the
-4 state to the +4 state [state 0 was omitted] in almost exact proportion to the tick-change, we chose the more parsimonious
specification of including the actual lagged price change.

5.1. Diagnostics.

A common diagnostic for the specification of an ordinary least squares regression
is to examine the properties of the residuals. If, for example, a time series regression
is well-specified, the residuals should approximate white noise and exhibit little serial
correlation. In the case of ordered probit, we cannot calculate the residuals directly since
we never observe the latent dependent variable $Z_k^*$ and therefore cannot compute $Z_k^* - X_k'\hat\beta$.
However, we do have an estimate of the conditional distribution of $Z_k^*$, conditional on the
$X_k$'s, based on the ordered probit specification and the maximum likelihood parameter
estimates. From this, we can obtain an estimate of the conditional distribution of the
$\varepsilon_k$'s from which we can construct generalized residuals $\hat\varepsilon_k$ along the lines suggested by
Gourieroux et al. (1985):

$$\hat\varepsilon_k = E[\, \varepsilon_k \mid Z_k, X_k, W_k ; \hat\theta_{ML} \,] \qquad (5.1)$$

where $\hat\theta_{ML}$ is the maximum likelihood estimator of the unknown parameter vector which,
in our case, contains $\alpha$, $\gamma$, $\beta$ and $\lambda$. In the case of ordered probit, if $Z_k$ is in the $j$th
state, i.e., $Z_k = s_j$, then the generalized residual $\hat\varepsilon_k$ may be expressed explicitly using the
moments of the truncated normal distribution as:

$$\hat\varepsilon_k = E[\, \varepsilon_k \mid Z_k = s_j, X_k, W_k ; \hat\theta_{ML} \,] = \hat\sigma_k \cdot \frac{\phi(c_1) - \phi(c_2)}{\Phi(c_2) - \Phi(c_1)} \qquad (5.2)$$

$$c_1 = \frac{1}{\hat\sigma_k} ( \hat\alpha_{j-1} - X_k'\hat\beta ) \qquad (5.3)$$

$$c_2 = \frac{1}{\hat\sigma_k} ( \hat\alpha_j - X_k'\hat\beta ) \qquad (5.4)$$

$$\hat\sigma_k = \sqrt{ 1 + \hat\gamma_1^2 \Delta t_k + \hat\gamma_2^2 \mathrm{AB}_{k-1} } \qquad (5.5)$$

where $\phi(\cdot)$ is the standard normal probability density function, $\Phi(\cdot)$ is the standard normal
cumulative distribution function, and for notational convenience, we define $\alpha_0 = -\infty$ and
$\alpha_m = +\infty$. Gourieroux et al. (1985) show that these
generalized residuals may be used to test for misspecification in a variety of ways. However,
some care is required in performing such tests. For example, although a natural
statistic to calculate is the first-order autocorrelation of the $\hat\varepsilon_k$'s, Gourieroux et al. observe
that the theoretical autocorrelation of the generalized residuals does not in general equal
the theoretical autocorrelation of the $\varepsilon_k$'s. Moreover, if the source of serial correlation is
an omitted lagged endogenous variable - if, for example, we included too few lags of $Z_k$ in
$X_k$ - then further refinements of the usual specification tests are necessary.
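
A generalized residual of this kind is the mean of a normal disturbance truncated to the interval implied by the observed state. The Python sketch below computes it for a hypothetical 5-state model; the function names and boundary values are assumptions, and the conditional mean and standard deviation are passed in directly rather than estimated.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def generalized_residual(j, xb, sigma, alphas):
    """Expected disturbance given that the observed state is s_j (1-based j),
    using the truncated normal mean with boundaries alpha_{j-1} and alpha_j."""
    cuts = [-math.inf] + list(alphas) + [math.inf]
    c1 = (cuts[j - 1] - xb) / sigma
    c2 = (cuts[j] - xb) / sigma
    return sigma * (norm_pdf(c1) - norm_pdf(c2)) / (norm_cdf(c2) - norm_cdf(c1))

alphas = [-1.5, -0.5, 0.5, 1.5]    # hypothetical boundaries, m = 5
res = [generalized_residual(j, 0.0, 1.0, alphas) for j in range(1, 6)]
print(res)   # negative for low states, zero in the middle, positive for high
```

Weighting each state's residual by the probability of that state gives zero, which mirrors the zero mean of the underlying disturbance.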

Gourieroux et al. (1985) derive valid tests for serial correlation from lagged endogenous
variables using the score statistic, essentially the derivative of the likelihood function with
respect to an autocorrelation parameter, evaluated at the maximum likelihood estimates
under the null hypothesis of no serial correlation. More specifically, consider the following
model for our $Z_k^*$:

$$Z_k^* = \varphi Z_{k-1}^* + X_k'\beta + \varepsilon_k , \qquad |\varphi| < 1 . \qquad (5.6)$$

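In this case, the score statistic $\hat\xi_1$ is the derivative of the likelihood function with respect
to $\varphi$ evaluated at the maximum likelihood estimates, and under the null hypothesis that
$\varphi = 0$ it simplifies to the following expression:

$$\hat\xi_1 = \frac{ \left( \sum_{k=2}^{n} \hat Z_{k-1} \hat\varepsilon_k \right)^2 }{ \sum_{k=2}^{n} \hat Z_{k-1}^2 \hat\varepsilon_k^2 } \qquad (5.7)$$

where

$$\hat Z_k = E[\, Z_k^* \mid Z_k, X_k, W_k ; \hat\theta_{ML} \,] \qquad (5.8)$$

$$\phantom{\hat Z_k} = X_k'\hat\beta + \hat\varepsilon_k . \qquad (5.9)$$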

When $\varphi = 0$, $\hat\xi_1$ is asymptotically distributed as a $\chi_1^2$ variate. More generally, we can test
the higher-order specification:

$$Z_k^* = \varphi Z_{k-j}^* + X_k'\beta + \varepsilon_k , \qquad |\varphi| < 1 \qquad (5.10)$$

by using the score statistic $\hat\xi_j$:

$$\hat\xi_j = \frac{ \left( \sum_{k=j+1}^{n} \hat Z_{k-j} \hat\varepsilon_k \right)^2 }{ \sum_{k=j+1}^{n} \hat Z_{k-j}^2 \hat\varepsilon_k^2 } \qquad (5.11)$$

which is also asymptotically $\chi_1^2$ under the null hypothesis $\varphi = 0$. For further intuition,
we can compute the sample correlation $\hat\nu_j$ of the generalized residual $\hat\varepsilon_k$ with the lagged
generalized fitted values $\hat Z_{k-j}$ - under the null hypothesis of no serial correlation in the
$\varepsilon_k$'s, the theoretical value of this correlation is 0, hence the sample correlation will provide
one measure of the economic impact of misspecification.
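
The arithmetic of the lag-$j$ score statistic can be shown with a toy Python example. It assumes the outer-product form of the statistic, i.e., the squared sum of the products $\hat Z_{k-j} \hat\varepsilon_k$ divided by the sum of their squares; the function name and the two short series are hypothetical and far too short for the asymptotic distribution to apply.

```python
def score_statistic(z_hat, e_hat, j):
    """Squared sum of lagged-fitted-value times residual products, divided by
    the sum of the squared products (outer-product form, lag j)."""
    terms = [z_hat[k - j] * e_hat[k] for k in range(j, len(e_hat))]
    return sum(terms) ** 2 / sum(t * t for t in terms)

z_hat = [1.0, -1.0, 1.0, -1.0, 1.0]     # hypothetical generalized fitted values
e_hat = [0.5, 0.5, -0.5, 0.5, -0.5]     # hypothetical generalized residuals

xi_1 = score_statistic(z_hat, e_hat, 1)
print(xi_1)            # 4.0
print(xi_1 > 3.841)    # True: above the 5 percent chi-square(1) critical value
```

In this toy series every product of a lagged fitted value and the current residual has the same sign, which is why the statistic is large.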

In Table 2b we report the first twelve autocorrelations of the generalized residuals $\{\hat\varepsilon_k\}$
for our sample of 11 stocks. Although they are generally small, recall that they converge
asymptotically to population values that need not equal the theoretical autocorrelations
of the disturbances $\{\varepsilon_k\}$. Moreover, in the presence of lagged endogenous variables, they
are biased towards 0. In Table 2c, the correlations $\hat\nu_j$, $j = 1, \ldots, 12$, which are not biased
towards 0, are reported, and they are also generally small.

Finally, Table 2d reports the score statistics $\hat\xi_j$, $j = 1, \ldots, 12$. Since we have included
three lags of $Z_k$ in our specification of $X_k$, it is no surprise that none of the score statistics
for $j$ = 1, 2 and 3 are statistically significant at the 5 percent level. However, at lag 4,
the score statistics for all stocks except ABY, CUE and HNH are significant, indicating
the presence of some serial dependence not accounted for by our specification. But recall
that we have very large sample sizes so that virtually any point null hypothesis will be
rejected. With this in mind, the score statistics seem to indicate a reasonably good fit
for all but one stock: NAV. Its score statistic is significant at every lag, suggesting the
need for re-specification. Turning back to the cross-autocorrelations reported in Table 2c,
we see that NAV's residual $\hat\varepsilon_k$ has a -0.088 correlation with $\hat Z_{k-4}$, the largest in Table
2c in absolute value. This suggests that adding $Z_{k-4}$ as a regressor might improve the
specification.

There are of course a number of other specification tests that can check the robustness
of the ordered probit specification, and they should be performed with an eye towards
particular applications. For example, when studying the impact of information variables
on volatility, a more pressing concern would be the specification of the conditional variance
$\sigma_k^2$. If some of the parameters have important economic interpretations, their stability can be
checked by simple likelihood ratio tests on subsamples of the data. And if forecasting price
changes is of interest, an $R^2$-like measure can readily be constructed to measure how much
variability can be explained by the predictors. The ordered probit model is flexible enough
to accommodate virtually any specification test designed for simple regression models, but
has many obvious advantages over OLS as we shall see below.

5.2. Endogeneity of $\Delta t_k$ and $\mathrm{IBS}_k$

Our  inferences  in  the  preceding  sections  are  based  on  the  implicit  assumption  that  the 
explanatory variables $X_k$ are all exogenous or predetermined with respect to the dependent
variable $Z_k$. However, the variable $\Delta t_k$ is contemporaneous to $Z_k$ and deserves further
discussion. 

Recall that $Z_k$ is the price change between trades at time $t_{k-1}$ and time $t_k$. Since $\Delta t_k$
is simply $t_k - t_{k-1}$, it may well be that $\Delta t_k$ and $Z_k$ are determined simultaneously, in which
case our parameter estimates are generally inconsistent. In fact, there are several plausible
arguments for the endogeneity of $\Delta t_k$. One such argument turns on the tendency of floor
brokers to break up large trades into smaller ones, and time the executions carefully during
the course of the day or several days. By "working" the order, the floor broker can minimize
the price impact of his trades and obtain more favorable execution prices for his clients.
But by selecting the times between his trades based on current market conditions, which
include information also affecting price changes, the floor broker is creating endogenous
trade times.

However, any given sequence of trades in our dataset does not necessarily correspond
to consecutive transactions of any single individual [other than the specialist of course],
but is the result of many buyers and sellers interacting with the specialist. For example,
even if a floor broker were working a large order, in between his orders might be purchases
and sales from other floor brokers, market orders, and triggered limit orders. Therefore,
the $\Delta t_k$'s also reflect these trades, which are not necessarily information-motivated.

[Footnote: See, for example, Admati and Pfleiderer (1988, 1989) and Easley and O'Hara (1990).]

Another more intriguing reason that $\Delta t_k$ may be exogenous is that floor brokers
have an economic incentive to minimize the correlation between $\Delta t_k$ and virtually all
other exogenous and predetermined variables. To see this, suppose the floor broker timed
his trades in response to some exogenous variable also affecting price changes, call it
"weather." Suppose that price changes tend to be positive in good weather and negative
in bad weather. Knowing this, the floor broker will wait until bad weather prevails before
buying, hence trade times and price changes are simultaneously determined by weather.
However, if other traders are also aware of these relations, they can garner information
about the floor broker's intent by watching his trades and by recording the weather, and
trade against him successfully. To prevent this, the floor broker must trade to deliberately
minimize the correlation between his trade times and the weather. As such, the floor
broker has an economic incentive to reduce simultaneous equations bias! Moreover, this
argument applies to any other economic variable that can be used to jointly forecast trade
times and price changes. For these two reasons, we assume that $\Delta t_k$ is exogenous.

6.  Applications. 

In applying our parameter estimates to specific issues of the market microstructure, we
must  first  consider  how  to  interpret  the  ordered  probit  model  from  an  economic  perspective. 
Since  ordered  probit  may  be  viewed  as  a  generalization  of  a  linear  regression  model  to 
situations  with  a  discrete  dependent  variable,  interpreting  its  parameter  estimates  is  much 
like  interpreting  coefficients  of  a  linear  regression  -  the  particular  interpretation  depends 
critically  on  the  underlying  economic  motivation  for  including  and  excluding  particular 
regressors.  In  a  very  few  instances,  theoretical  paradigms  might  yield  testable  implications 
in  the  form  of  linear  regression  equations,  e.g.,  the  CAPM's  security  market  line.  However, 
linear regression is more often used as a means of capturing and summarizing empirical
relations  in  the  data  that  have  not  yet  been  derived  from  economic  first  principles. 

In  much  the  same  way,  ordered  probit  may  be  interpreted  as  a  means  of  capturing  and 
summarizing  relations  among  price  changes  and  other  economic  variables  such  as  volume. 
Such  relations  have  been  derived  from  first  principles  only  in  the  most  simplistic  and 
stylized  of  contexts,  under  very  specific  and,  therefore,  often  counterfactual  assumptions 
about  agents'  preferences,  information  sets,  alternative  investment  possibilities,  sources  of 
uncertainty and their parametric form [usually Gaussian], and the timing and allowable
volume  and  type  of  trades.  Although  such  models  do  yield  invaluable  insights  about  the 
economics  of  the  market  microstructure,  they  are  too  easily  rejected  by  the  data  because  of 
the  many  restrictive  assumptions  needed  to  obtain  easily  interpretable  closed-form  results. 

And  yet  the  broader  implications  of  such  models  can  still  be  "tested"  by  checking  for 
simple  relations  among  economic  quantities,  as  we  illustrate  in  Section  6.1.  But  some  care 
must be taken in interpreting such results, as in the case of a simple linear regression of
prices  on  quantities,  which  cannot  be  interpreted  as  an  estimated  demand  curve  without 
imposing  additional  economic  structure. 

[Footnote: We have also explored some adjustments for the endogeneity of $\Delta t_k$ along the lines of Hausman (1978) and Newey (1985),
and our preliminary estimates show that although exogeneity of $\Delta t_k$ may be rejected at conventional significance levels [recall
our sample sizes], the estimates do not change much once endogeneity is accounted for by an instrumental variables estimation
procedure.]

[Footnote: Just a few examples of this growing literature are Amihud and Mendelson (1980), Admati and Pfleiderer (1988, 1989),
Easley and O'Hara (1987), Garman (1976), Glosten and Milgrom (1985), Ho and Stoll (1980, 1981), Kyle (1985), Stoll (1989),
and Wang (1991).]

In particular, although the ordered probit model can shed light on how price changes
respond to specific economic variables, it cannot give us economic insights beyond whatever
structure we choose to impose a priori. For example, since we have placed no specific
theoretical structure on how prices are formed, our ordered probit estimates cannot yield
sharp implications for the impact of floor brokers "working" an order [executing a large
order in smaller bundles to obtain the best average price]. The ordered probit estimates
will reflect the combined actions and interactions of these floor brokers, the specialists,
and individual and institutional investors all trading among each other. Unless we are
estimating a fully articulated model of economic equilibrium that contains these kinds
of market participants, we cannot separate their individual impact in determining price
changes. For example, without additional structure we cannot answer the question: What
is the price impact of an order that is not "worked"?

However,  if  we  were  able  to  identify  those  large  trades  that  did  benefit  from  the 
services  of  a  floor  broker,  we  could  certainly  compare  and  contrast  their  empirical  price 
dynamics  with  those  of  "un-worked"  trades  using  the  ordered  probit  model.  And  such 
comparisons might provide additional guidelines and restrictions for developing new
theories of the market microstructure. Interpreted in this way, the ordered probit model
can be a valuable tool for uncovering empirical relations even in the absence of a highly
parametrized theory of the market microstructure. To illustrate this aspect of ordered
probit,  in  the  following  section  we  consider  three  specific  applications  of  the  parameter 
estimates  of  Section  5:  a  test  for  order-flow  dependence  in  price  changes,  a  measure  of 
price  impact,  and  a  comparison  of  ordered  probit  to  ordinary  least  squares. 


6.1.  Order-Flow  Dependence. 

Several recent theoretical papers in the market microstructure literature have shown
the importance of information in determining relations between prices and trade size. In
particular, Easley and O'Hara (1987) observe that because informed traders prefer to trade
larger amounts than uninformed liquidity traders, the size of a trade contains information
about who the trader is and, consequently, also contains information about the traders'
private information. As a result, prices in their model do not satisfy the Markov property
- the conditional distribution of next period's price depends on the entire history of past
prices, i.e., on the order flow. That is, the sequence of price changes of 1/-1/1 will have
a different effect on the conditional mean than the sequence -1/1/1, even though both
sequences yield the same total price change over the three trades.

One simple implication of such order-flow dependence is that the coefficients of the
three lags of $Z_k$ are not identical - if they are, then only the sum of the most recent three
price changes matters in determining the conditional mean, and not the order in which
those price changes occurred. Therefore, if we denote by $\beta_p$ the vector of coefficients
$[\, \beta_2 \;\; \beta_3 \;\; \beta_4 \,]'$ of the lagged price changes, the null hypothesis $H$ of order-flow independence
is simply:

\[ H: \qquad \beta_2 \;=\; \beta_3 \;=\; \beta_4 . \]

This may be re-cast as a linear hypothesis for $\beta_p$, namely $A\beta_p = 0$ where:

\[ A \;\equiv\; \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{pmatrix} . \tag{6.1} \]

Then under $H$, we obtain the following test statistic:

\[ \hat{\beta}_p' A' \left( A \hat{V}_p A' \right)^{-1} A \hat{\beta}_p \;\overset{a}{\sim}\; \chi^2_2 \tag{6.2} \]

where $\hat{V}_p$ is the estimated asymptotic covariance matrix of $\hat{\beta}_p$. The values of these test
statistics for the 11 stocks are: IBM=11,462.43, ABY=2.17, CUE=152.05, DOW=2,666.13,
FNB=661.01, FWC=446.01, HNH=18.62, NAV=1,184.48, RBK=2,708.89, S=3,854.62,
and T=3,428.92. The null hypothesis of order-flow independence may be rejected at all
the usual levels of significance for all but one stock, ABY, whose test statistic has a p-value
of 33.8 percent. But even in the case of ABY, the point estimates of the coefficients do
seem to differ considerably [$\hat{\beta}_3$ is half of $\hat{\beta}_2$ and $\hat{\beta}_{?}$ is about one-third of $\hat{\beta}_{?}$], and our
failure to reject is due primarily to imprecise parameter estimates [note that $\hat{\beta}_{?}$ and $\hat{\beta}_{?}$ are
not statistically significant]. These findings support Easley and O'Hara's observation that
information-based trading can lead to path-dependent price changes, so that the order
flow [and the entire history of other variables] may affect the conditional distribution of
the next price change.
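The mechanics of this test can be sketched in a few lines of Python. The coefficient vector and covariance matrix below are hypothetical placeholders rather than our estimates, and the function names are our own. The only number taken from the text is ABY's statistic of 2.17: because the survival function of a $\chi^2_2$ variate is $\exp(-w/2)$, its p-value can be checked directly.

```python
import math

# Restriction matrix for H: beta_2 = beta_3 = beta_4.
A = [[1.0, -1.0, 0.0],
     [0.0, 1.0, -1.0]]

def wald_statistic(beta_p, v_p):
    """Quadratic form (A b)' (A V A')^{-1} (A b) for a 3-vector b and 3x3 matrix V."""
    r = [sum(A[i][k] * beta_p[k] for k in range(3)) for i in range(2)]
    av = [[sum(A[i][k] * v_p[k][j] for k in range(3)) for j in range(3)]
          for i in range(2)]
    m = [[sum(av[i][k] * A[j][k] for k in range(3)) for j in range(2)]
         for i in range(2)]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    inv = [[m[1][1] / det, -m[0][1] / det],
           [-m[1][0] / det, m[0][0] / det]]
    return sum(r[i] * inv[i][j] * r[j] for i in range(2) for j in range(2))

def chi2_2_pvalue(w):
    """Upper-tail probability of a chi-squared variate with 2 degrees of freedom."""
    return math.exp(-w / 2.0)

# Hypothetical estimates (illustration only, not from Table 2a).
beta_p = [-1.0, -0.5, -0.2]
v_p = [[0.04, 0.01, 0.00],
       [0.01, 0.05, 0.01],
       [0.00, 0.01, 0.06]]

w = wald_statistic(beta_p, v_p)
print(w, chi2_2_pvalue(w))

# ABY's reported statistic of 2.17 implies a p-value of roughly 33.8 percent.
print(round(100.0 * chi2_2_pvalue(2.17), 1))
```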

6.2. Measuring Price Impact Per Unit Volume of Trade.

By price impact we mean the effect of a current trade of a given size on the conditional
distribution of the subsequent price change. As such, the coefficients of the variables
$T_\lambda(V_{k-j}) \cdot \mathrm{IBS}_{k-j}$, $j = 1, 2, 3$ measure the price impact of trades per unit of transformed
dollar volume. More precisely, recall that our definition of the volume variable is the Box-
Cox transformation of dollar volume divided by 100, hence the coefficient $\beta_{11}$ for stock $i$ is
the contribution to the conditional mean $X_k'\beta$ that results from a trade of $100 \cdot (1 + \lambda_i)^{1/\lambda_i}$ dollars
[since $T_\lambda((1 + \lambda)^{1/\lambda}) = 1$]. Therefore, the impact of a trade of $M$ dollars at time $k-1$ on
$X_k'\beta$ is simply $\beta_{11} T_\lambda(M/100)$. Now the estimated $\beta_{11}$'s in Table 2a are generally positive
and significant, with the most recent trade having the largest impact. But this is not the
impact we seek since $X_k'\beta$ is the conditional mean of the unobserved variable $Z_k^*$, not the
observed price change $Z_k$. In particular, since $X_k'\beta$ is scaled by $\sigma_k$ in (2.10), it is difficult
to make meaningful comparisons of the $\beta_{11}$'s across stocks.
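The arithmetic behind the unit-volume trade can be verified with a short Python sketch. It assumes the usual Box-Cox form $T_\lambda(x) = (x^\lambda - 1)/\lambda$, with the logarithm as the limiting case at $\lambda = 0$; the value of $\lambda$, the coefficient, and the function names are hypothetical.

```python
import math

def box_cox(x, lam):
    """Box-Cox transformation (x^lam - 1)/lam, with log(x) as the lam = 0 limit."""
    if lam == 0.0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

def unit_trade_dollars(lam):
    """Dollar size whose transformed volume T_lam(dollars / 100) equals one."""
    if lam == 0.0:
        return 100.0 * math.e
    return 100.0 * (1.0 + lam) ** (1.0 / lam)

def latent_mean_shift(beta, dollars, lam):
    """Contribution beta * T_lam(dollars / 100) to the latent conditional mean."""
    return beta * box_cox(dollars / 100.0, lam)

lam = 0.5     # hypothetical Box-Cox parameter
beta = 0.02   # hypothetical volume coefficient

size = unit_trade_dollars(lam)
print(size, box_cox(size / 100.0, lam), latent_mean_shift(beta, size, lam))
```

For a trade of the unit size the shift equals the coefficient itself, which is the sense in which the coefficient measures impact per unit of transformed volume.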

To obtain a measure of a trade's price impact that we can compare across stocks,
we must translate the impact on $X_k'\beta$ into an impact on the conditional distribution
of the $Z_k$'s, conditioned on the trade size and other quantities. Since we have already
established that the conditional distribution of price changes is order-flow dependent, we
must condition on a specific sequence of past price changes and trade sizes. We do this
by substituting our parameter estimates into (2.10), choosing particular values for the
explanatory variables $X_k$, and computing the probabilities explicitly. In particular, for
each stock $i$ we set $\Delta t_k$ and $AB_{k-1}$ to their sample means for that stock and set the
remaining regressors to the following values:


\begin{align*}
V_{k-2} &= \tfrac{1}{100} \cdot \text{Median Dollar Volume for Stock } i \\
V_{k-3} &= \tfrac{1}{100} \cdot \text{Median Dollar Volume for Stock } i \\
\mathrm{SP500}_{k-1} &= 0.001 \\
\mathrm{SP500}_{k-2} &= 0.001 \\
\mathrm{SP500}_{k-3} &= 0.001 \\
\mathrm{IBS}_{k-1} &= 1 \\
\mathrm{IBS}_{k-2} &= 1 \\
\mathrm{IBS}_{k-3} &= 1
\end{align*}


Specifying values for these variables is equivalent to specifying the market conditions
under which we wish to measure price impact. These particular values correspond to a scenario
in which the most recent three trades are buys, where the sizes of the two earlier trades are
equal to the stock's median dollar volume, and where the market has been rising during
the past 15 minutes. We then evaluate the probabilities in (2.10) for different values of
$V_{k-1}$, $Z_{k-1}$, $Z_{k-2}$, and $Z_{k-3}$.
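The calculation just described can be sketched as follows. The sketch assumes that (2.10) has the standard ordered probit form, in which the probability of each state is the difference of two normal distribution functions evaluated at the scaled partition boundaries; the boundaries, states, and latent means below are hypothetical and are not our estimates.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ordered_probit_probs(xb, sigma, cutoffs):
    """State probabilities given latent mean xb, scale sigma, increasing cutoffs."""
    cdf = [0.0] + [norm_cdf((a - xb) / sigma) for a in cutoffs] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

def conditional_mean(probs, states):
    """Expected price change in ticks under the state probabilities."""
    return sum(p * s for p, s in zip(probs, states))

# Hypothetical setup: five states (in ticks) and four partition boundaries.
states = [-2, -1, 0, 1, 2]
cutoffs = [-1.5, -0.5, 0.5, 1.5]
sigma = 1.0

# Latent means for a base-case trade and for a larger trade (hypothetical values).
xb_base, xb_large = 0.0, 0.3

mean_base = conditional_mean(ordered_probit_probs(xb_base, sigma, cutoffs), states)
mean_large = conditional_mean(ordered_probit_probs(xb_large, sigma, cutoffs), states)

# Price impact of the larger trade: the change in the conditional mean, in ticks.
price_impact = mean_large - mean_base
print(mean_base, mean_large, price_impact)
```

Taking the difference of two conditional means, rather than either level, is what removes effects common to both scenarios.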

For brevity, we focus only on the means of these conditional distributions, which we
report in Tables 3 and 4 for the 11 stocks. The entries in Table 3 are computed under
the assumption that $Z_{k-1} = Z_{k-2} = Z_{k-3} = +1$, whereas those in Table 4 are computed
under the assumption that $Z_{k-1} = Z_{k-2} = Z_{k-3} = 0$. The first entry in the "IBM"
column in Table 3, -1.315, is the expected price change [in ticks] of the next transaction of
IBM following a $5,000 buy. The seemingly counterintuitive sign of this conditional mean
is the result of the "bid/ask bounce" - since the past three trades were assumed to be
buys, the parameter estimates reflect the empirical fact that the next transaction can be
a sell, in which case the transaction price change will often be negative since the price will
go from ask to bid. To account for this effect, we would need to include a contemporaneous
buy/sell indicator, $\mathrm{IBS}_k$, in $X_k$ and condition on this variable as well. But such a variable
is clearly endogenous to $Z_k$ and our parameter estimates would suffer from the familiar
simultaneous-equations biases.

[Footnote: In fact, including the contemporaneous buy/sell indicator $\mathrm{IBS}_k$ and contemporaneous transformed volume $T_\lambda(V_k)$ would
yield a more natural measure of price impact, since such a specification, when consistently estimated, can be used to quantify the
expected total cost of transacting a given volume. Unfortunately, there are few circumstances in which the contemporaneous
buy/sell indicator $\mathrm{IBS}_k$ may be considered exogenous, since simple economic intuition suggests that factors affecting price
changes must also enter the decision to buy or sell. Indeed, limit orders are explicit functions of the current price. Therefore,
if $\mathrm{IBS}_k$ is to be included as an explanatory variable in $X_k$, its endogeneity must be taken into account. Unfortunately, the
standard estimation techniques such as two-stage or three-stage least squares do not apply here because of our discrete dependent
variable. Moreover, techniques that allow for discrete dependent variables cannot be applied because the endogenous regressor
$\mathrm{IBS}_k$ is also discrete. In principle, it may be possible to derive consistent estimators by considering a joint ordered probit model
for both variables, but this is beyond the scope of the current paper. For this reason, we restrict our specification to include
only lags of $\mathrm{IBS}_k$ and $V_k$.]

However, we can "net out" the effect of the bid/ask spread by computing the change
in the conditional mean for trade sizes larger than our base case $5,000 buy. As long as the
bid/ask spread remains relatively stable, the change in the conditional mean induced by
larger trades will give us a measure of price impact that is independent of it. In particular,
the second entry in the "IBM" column of Table 3 shows that purchasing an additional
$5,000 of IBM [$10,000 total] increases the conditional mean by 0.060 ticks. However,
purchasing an additional $495,000 of IBM [$500,000 total] increases the conditional mean
by 0.371 ticks - as expected, trading a larger quantity always yields a larger price impact.

A  comparison  across  columns  in  the  upper  panel  of  Table  3  shows  that  large  trades 
have higher price impact for CUE than for the other ten stocks. However, such a comparison
ignores the fact that these stocks trade at different price levels, hence a price impact of
0.473  ticks  for  $500,000  of  CUE  may  not  be  as  large  a  percentage  of  price  as  a  price  impact 
of  0.191  ticks  for  $500,000  of  NAV.  The  lower  panel  of  Table  3  reports  the  price  impact  as 
percentages  of  the  average  of  the  high  and  low  prices  of  each  stock,  and  a  trade  of  $500,000 
does  have  a  higher  percentage  price  impact  for  NAV  than  for  CUE  -  0.434  percent  versus 
0.068  percent  -  even  though  its  impact  is  considerably  smaller  when  measured  in  ticks. 
Interestingly,  even  as  a  percentage,  price  impact  increases  with  dollar  volume. 
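The conversion from ticks to a percentage of price is simple arithmetic, sketched below in Python. The tick size of one eighth of a dollar follows from the data; the tick impacts and the high and low prices are hypothetical and are chosen only to show how a smaller impact in ticks can be the larger impact in percentage terms for a low-priced stock.

```python
TICK_DOLLARS = 0.125  # one tick is an eighth of a dollar

def percent_price_impact(impact_ticks, high_price, low_price):
    """Impact in ticks as a percentage of the average of the high and low prices."""
    average_price = 0.5 * (high_price + low_price)
    return 100.0 * impact_ticks * TICK_DOLLARS / average_price

# Hypothetical low-priced stock: small impact in ticks, price range 4 to 6 dollars.
low_priced = percent_price_impact(0.2, 6.0, 4.0)

# Hypothetical high-priced stock: larger impact in ticks, price range 70 to 90 dollars.
high_priced = percent_price_impact(0.5, 90.0, 70.0)

print(low_priced, high_priced, low_priced > high_priced)
```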

In Table 4, where price impact values have been computed under the alternative
assumption that $Z_{k-1} = Z_{k-2} = Z_{k-3} = 0$, the conditional means $E[Z_k]$ are closer to zero
for the $5,000 buy. For example, the expected price change of NAV is now -0.235 ticks,
whereas in Table 3a it was -1.670 ticks. Since we are now conditioning on a different
scenario, in which the three most recent transactions are buys that have no impact on
prices, the empirical estimates imply more probability in the right tail of the conditional
distribution of the subsequent price change.

That the conditional mean is still negative may signal the continued importance of
the bid/ask spread; nevertheless, the price impact measure $\Delta E[Z_k]$ does increase with
dollar volume. Moreover, these values are similar in magnitude to those in Table 3 - in
percentage terms the price impact is virtually the same in both tables for most of the
11 stocks. However, for NAV, RBK and T the percentage price impact measures differ
considerably between Tables 3 and 4, suggesting that price impact must be measured
security by security.

Of course, there is no reason to focus solely on the mean of the conditional distribution
of $Z_k$ since we have at our disposal an estimate of the entire distribution. Under the
scenarios of Tables 3 and 4 we have also computed the standard deviations of conditional
distributions, but since they are quite stable across the two scenarios we have omitted them
from the tables for the sake of brevity. However, to get a sense of their sensitivity to the
conditioning variables, we have plotted in Figure 3 the estimated conditional probabilities
for the 11 stocks under both scenarios. In each graph, the lightly cross-hatched bars
represent the conditional distribution for the sequence of three buys with a +1 tick price
change at each trade, with a fixed trade size equal to the sample median volume for each.
The dark-shaded bars represent the conditional distribution for the same sequence of three
buys but with zero price change for each of the three transactions, also each for a fixed
trade size equal to the sample median. The conditional distribution is clearly shifted more
to the right under the first scenario than under the second, as the conditional means in
Tables 3 and 4 foreshadowed. However, the general shape of the distribution seems rather
well-preserved - changing the path of past price changes seems to translate the conditional
distribution without greatly altering the tail probabilities.

As a final summary of price impact for these securities, we plot "price response"
functions in Figure 4 for the 11 stocks, which give the percentage price impact as a
function of dollar volume. The price response function reveals several features of the
market microstructure that are not as apparent from the numbers in Tables 3 and 4. For
example, market liquidity is often defined as the ability to trade any volume with little
or no price impact, hence in very liquid markets the price response function should be
constant at zero - a flat price response function implies that the percentage price impact
is not affected by the size of the trade. Therefore a visual measure of liquidity is the
curvature of the price response function; it is no surprise that IBM possesses the flattest
price response function.

More generally, the shape of the price response function measures whether there are
any economies or dis-economies of scale in trading. An upward-sloping curve implies
dis-economies of scale - larger dollar volume trades will yield higher percentage price impact.
As such, the slope may be one measure of "market depth." For example, if the market
for a security is "deep," this is usually taken to mean that large volumes may be traded
before much of a price impact is observed. In such cases, the price response function may
even be downward sloping. In Figure 4, all 11 stocks exhibit trading dis-economies of scale
since the price response functions are all upward-sloping, although they increase at a decreasing
rate. Such dis-economies of scale suggest that it might pay to break up large trades into
sequences of smaller ones. However, recall that the values in Figure 4 are derived from
conditional distributions, conditioned on particular sequences of trades and prices. A
comparison of the price impact of, say, one $100,000 trade with two $50,000 trades can be
performed only if the conditional distributions are recomputed to account for the different
sequences implicit in the two alternatives. Since these two distinct sequences have not been
accounted for in Figure 4, the benefits of dividing large trades into smaller ones cannot
be inferred from it. Nevertheless, with the ML estimates in hand, such comparisons are
trivial to calculate on a case-by-case basis.

Since price response functions are defined in terms of percentage price impact, cross-
stock comparisons of liquidity can also be made. Figure 4 shows that NAV, RBK and
FWC are considerably less liquid than the other stocks. This is partly due to the low price
ranges that the three stocks traded in during 1988 [see Table 1] - although RBK and S
have comparable price impacts when measured in ticks [see Table 3], RBK looks much
less liquid when impact is measured as a percentage of price since its share price traded
between $10.250 and $18.375 whereas S traded between $32.250 and $46.250 during 1988.
Not surprisingly, since their price ranges are among the highest in the sample, IBM, CUE
and DOW have the lowest price response functions.

6.3.  Does  Discreteness  Matter? 

Despite the elegance and generality with which the ordered probit framework accounts
for price discreteness, irregular trading intervals, and the influence of explanatory variables,
the complexity of the estimation procedure raises the question of whether these features
can be satisfactorily addressed by a simpler model. Since ordered probit may be viewed
as a generalization of the linear regression model to discrete dependent variables, it is not
surprising that the latter may share many of the advantages of the former, price discreteness
aside. However, linear regression is considerably easier to implement. Therefore, what is
gained by ordered probit? For example, suppose we ignore the fact that price changes $Z_k$
are discrete, estimate the following simple regression model via ordinary least squares:

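\[ Z_k \;=\; X_k'\beta + \epsilon_k \tag{6.3} \]

and then compute the conditional distribution of $Z_k$ by rounding to the nearest eighth,
thus:

\[ \Pr\!\left( Z_k = \tfrac{j}{8} \right) \;=\; \Pr\!\left( \tfrac{j}{8} - \tfrac{1}{16} \;\le\; X_k'\beta + \epsilon_k \;<\; \tfrac{j}{8} + \tfrac{1}{16} \right) . \tag{6.4} \]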


With suitable restrictions on the $\epsilon_k$'s, the regression model (6.3) is known as the "linear
probability" model. The problems associated with applying ordinary least squares to (6.3)
are well-known [see for example Judge et al. (1985, Ch. 18.2.1)], and numerous extensions
have been developed to account for such problems. However, implementing such extensions
is at least as involved as maximum likelihood estimation of the ordered probit model and
therefore the comparison is of less immediate interest. In spite of these problems, we may
still ask whether the OLS estimates of (6.3) and (6.4) yield an adequate "approximation" to
a more formal model of price discreteness. Specifically, how different are the probabilities
in (6.4) from those of the ordered probit model? If the differences are small, then the
linear regression model (6.3) may be an adequate substitute for ordered probit.

Under the assumption of i.i.d. Gaussian $\epsilon_k$'s, we evaluate the conditional probabilities
in (6.4) using the OLS parameter estimates and the same values for the $X_k$'s as in Section
6.2, and graph them and the corresponding ordered probit probabilities in Figure 5. These
graphs show that the two models can yield very different conditional probabilities. All of
the OLS conditional distributions are unimodal and have little weight in the tails, in sharp
contrast to the much more varied conditional distributions generated by ordered probit.
For example, the OLS conditional probabilities show no evidence of the non-monotonicity
that is readily apparent from the ordered probit probabilities of CUE, NAV and, to a
lesser extent, RBK. In particular, for NAV and RBK a price change of -3 ticks is clearly
less probable than either -2 or -4 ticks, and for CUE, a price change of -1 tick is less
probable than of -2 ticks.
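To see why the rounded OLS probabilities are necessarily unimodal under Gaussian errors, the following Python sketch evaluates probabilities of the form in (6.4), measured in ticks so that rounding to the nearest eighth of a dollar becomes rounding to the nearest integer. The fitted mean and residual standard deviation are hypothetical, and the function names are our own.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ols_rounded_probs(xb, sigma, ticks):
    """Probability that xb + e rounds to each tick value, with e ~ N(0, sigma^2)."""
    return [norm_cdf((i + 0.5 - xb) / sigma) - norm_cdf((i - 0.5 - xb) / sigma)
            for i in ticks]

def is_unimodal(probs):
    """True if the probabilities rise to a single peak and fall afterwards."""
    peak = probs.index(max(probs))
    rising = all(a <= b for a, b in zip(probs[:peak], probs[1:peak + 1]))
    falling = all(a >= b for a, b in zip(probs[peak:], probs[peak + 1:]))
    return rising and falling

ticks = list(range(-4, 5))
xb, sigma = -0.3, 1.2   # hypothetical fitted mean and residual s.d., in ticks

probs = ols_rounded_probs(xb, sigma, ticks)
print([round(p, 4) for p in probs], is_unimodal(probs))
```

Because every state is an interval of the same width under a single bell-shaped density, no state can be less probable than both of its neighbors, whatever values the mean and variance take.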

Nevertheless, for some of the 11 stocks, such as DOW, FNB and FWC, the OLS
and ordered probit probabilities are rather close. However, it is dangerous to conclude
from these matches that OLS is generally acceptable, since these conditional distributions
depend sensitively on the values of the conditioning variables. For example, we have
plotted these probabilities conditioned on much higher values for the conditional variance
$\sigma_k^2$, and in these cases there are strong differences between the OLS and ordered probit
distributions for all 11 stocks.

That OLS and ordered probit can differ is not surprising given the extra degrees
of freedom that the ordered probit model has to fit the conditional distribution of price
changes.^^ Because the ordered probit partition boundaries $\{\alpha_i\}$ are determined by the
data, the tail probabilities of the conditional distribution of price changes may be large


'^In fact, several colleagues have pointed out to us that the comparison of OLS and ordered probit is not a fair one because
of these extra degrees of freedom [for example, we could have allowed the OLS residual variance to be heteroskedastic]. But
this misses the point of our comparison, which was not meant to be fair. Our goal was to see whether a simpler technique could
provide the same information that a more complex technique like ordered probit does. It should come as no surprise that OLS
can come close to fitting nonlinear phenomena if it is suitably extended [in fact, ordered probit is one such extension]. But
such an extended OLS analysis is generally as complicated to perform as ordered probit, making the comparison less relevant
for our purposes.
or small relative to the probabilities of more central observations, unlike the probabilities
implied by (6.3) which are dictated by the [Gaussian] distribution function of $\epsilon_k$. Moreover,
it is unlikely that using another distribution function will provide as much flexibility as
ordered probit, for the simple reason that (6.3) constrains the state probabilities to be
linear in the $X_k$'s [hence the term "linear probability model"], whereas ordered probit
allows for nonlinear effects by letting the data determine the partition boundaries $\{\alpha_i\}$.
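
The contrast between the two sets of state probabilities can be illustrated with a small numerical sketch. It is not the estimation code behind Figure 5: the boundary values, conditional mean, and standard deviation below are hypothetical numbers chosen only to show the mechanism. With equally spaced "rounding" boundaries the Gaussian state probabilities are necessarily unimodal, whereas freely chosen boundaries can make a -1 tick change less probable than a -2 tick change.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def state_probs(boundaries, mean, sd):
    """P(state i) = Phi((a_i - mean)/sd) - Phi((a_{i-1} - mean)/sd)."""
    cuts = [-math.inf] + list(boundaries) + [math.inf]
    cdf = [norm_cdf((c - mean) / sd) for c in cuts]
    return [cdf[i + 1] - cdf[i] for i in range(len(cuts) - 1)]

def is_unimodal(p):
    """True if p rises weakly to its maximum and falls weakly afterwards."""
    peak = p.index(max(p))
    rising = all(p[i] <= p[i + 1] for i in range(peak))
    falling = all(p[i] >= p[i + 1] for i in range(peak, len(p) - 1))
    return rising and falling

# Nine states, -4 ticks or less up to +4 ticks or more (index = ticks + 4).
# Hypothetical conditional mean and standard deviation, in ticks.
mean, sd = 0.0, 1.5

# Rounding model: boundaries halfway between adjacent ticks.
rounded = state_probs([-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5], mean, sd)

# Ordered probit: boundaries are free parameters (hypothetical values here).
free = state_probs([-3.5, -2.8, -1.0, -0.7, 0.7, 1.0, 2.8, 3.5], mean, sd)

for ticks, (p_round, p_free) in zip(range(-4, 5), zip(rounded, free)):
    print(f"{ticks:+d} ticks  rounded={p_round:.3f}  free={p_free:.3f}")
```

In this sketch the narrow interval assigned to the -1 tick state is what produces the non-monotone pattern; no choice of mean or variance can reproduce it under equally spaced boundaries.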

A more direct test of the difference between ordered probit and the simple "rounded"
linear regression model is to consider the special case of ordered probit in which all the
partition boundaries $\{\alpha_i\}$ are equally spaced and fall on sixteenths. That is, let the
observed discrete price change $Z_k$ be related to the unobserved continuous random variable
$Z_k^*$ in the following manner:


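$$
Z_k \;=\;
\begin{cases}
-\tfrac{4}{8} \text{ or less} & \text{if } Z_k^* \in \left(-\infty,\; -\tfrac{4}{8} + \tfrac{1}{16}\right) \\[4pt]
\tfrac{j}{8} & \text{if } Z_k^* \in \left[\tfrac{j}{8} - \tfrac{1}{16},\; \tfrac{j}{8} + \tfrac{1}{16}\right), \quad j = -3, \ldots, 3 \\[4pt]
\tfrac{4}{8} \text{ or more} & \text{if } Z_k^* \in \left[\tfrac{4}{8} - \tfrac{1}{16},\; \infty\right)
\end{cases}
\qquad (6.5)
$$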


This follows the spirit of Ball (1988), in which there exists a "virtual" or "true" price
change $Z_k^*$ linked to the observed price change $Z_k$ by rounding $Z_k^*$ to the nearest multiple
of eighths of a dollar. A testable implication of (6.5) is that the partition boundaries $\{\alpha_i\}$
are equally spaced, i.e.,


$$
\alpha_2 - \alpha_1 \;=\; \alpha_3 - \alpha_2 \;=\; \cdots \;=\; \alpha_{m-1} - \alpha_{m-2}
\qquad (6.6)
$$


where $m$ is the number of states in our ordered probit model. We can re-write (6.6) as a
linear hypothesis for the $(m-1) \times 1$ vector of $\alpha$'s in the following way:


$$
H: \; A\alpha \;=\; 0
$$
Since the asymptotic distribution of the maximum likelihood estimator $\hat\alpha$ is given by:


$$
\sqrt{T}\,(\hat\alpha - \alpha) \;\overset{a}{\sim}\; N(0, \Sigma)
\qquad (6.9)
$$


where $\Sigma$ is the appropriate sub-matrix of the inverse of the information matrix corresponding
to the likelihood function (2.11), the "delta method" yields the asymptotic distribution
of the following statistic $\psi$ under the null hypothesis H:


$$
\psi \;\equiv\; T\,\hat\alpha' A' \left(A \Sigma A'\right)^{-1} A \hat\alpha \;\overset{H}{\sim}\; \chi^2_{m-3}
\qquad (6.10)
$$


Table 5 reports the $\psi$'s for our sample of 11 stocks, and since the 1 percent critical values
of the $\chi^2_2$ and $\chi^2_6$ are 9.21 and 16.8 respectively, we can easily reject the null hypothesis H
for each of the 11 stocks. However, because our sample sizes are so large, large $\chi^2$ statistics
need not signal important economic departures from the null hypothesis. Nevertheless the
point estimates of the $\alpha$'s in Tables 2a,b show that they do differ in economically important
ways from the simpler rounding model (6.5). With CUE, for example, $\alpha_3 - \alpha_2$ is 2.652 but
$\alpha_4 - \alpha_3$ is 1.031. Such a difference captures the empirical fact that, conditioned on the
$X_k$'s and $W_k$'s, -1 tick changes are less frequent than -2 tick changes, even less frequent
than predicted by the simple linear probability model. Discreteness does matter.
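
The mechanics of the equal-spacing test can be sketched numerically. The code below is an illustration, not the authors' implementation: it assumes that the restriction matrix A is the $(m-3) \times (m-1)$ second-difference matrix, which is one natural way to encode (6.6) and is consistent with the $\chi^2_{m-3}$ degrees of freedom, and the inputs `alpha_hat`, `sigma` and `T` are hypothetical placeholders for the estimated boundaries, their asymptotic covariance matrix, and the sample size.

```python
import numpy as np

def second_difference_matrix(m):
    """(m-3) x (m-1) matrix A with rows (1, -2, 1); A @ alpha = 0 exactly
    when the m-1 partition boundaries are equally spaced (assumed form)."""
    k = m - 1
    A = np.zeros((k - 2, k))
    for r in range(k - 2):
        A[r, r:r + 3] = [1.0, -2.0, 1.0]
    return A

def wald_equal_spacing(alpha_hat, sigma, T):
    """psi = T * a'A'(A Sigma A')^{-1} A a, asymptotically chi-square(m-3) under H."""
    alpha_hat = np.asarray(alpha_hat, dtype=float)
    A = second_difference_matrix(len(alpha_hat) + 1)
    d = A @ alpha_hat
    return float(T * d @ np.linalg.solve(A @ sigma @ A.T, d))

# Hypothetical five-state example (m = 5, four boundaries, 2 degrees of freedom).
sigma = np.eye(4)                      # placeholder asymptotic covariance
T = 1000                               # placeholder sample size
psi_equal = wald_equal_spacing([-1.5, -0.5, 0.5, 1.5], sigma, T)
psi_unequal = wald_equal_spacing([-1.5, -0.5, 0.5, 2.5], sigma, T)

CRIT_1PCT_CHI2_2 = 9.21                # 1 percent critical value quoted in the text
print(psi_equal, psi_unequal, psi_unequal > CRIT_1PCT_CHI2_2)
```

Because the statistic scales with T, even modest departures from equal spacing produce very large values in samples of tens of thousands of trades, which is why the text also examines the point estimates directly.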


7.  An  Extended  Sample. 

Although our sample of 11 securities contains several hundred thousand observations,
it is still only a small cross-section of the ISSM database which contains the transactions
of over two thousand stocks. Although it would be impractical for us to estimate our
ordered probit model for each one, we do apply our specification to an extended sample of
100 securities chosen randomly, 20 from each of market-value deciles 6 through 10 [decile
10 contains companies with beginning-of-year market values in the top 10 percent of the
entire database], also with the restriction that none of these 100 engaged in stock splits or
stock dividends greater than or equal to 3:2. Table 6 lists the companies' names, ticker
symbols, market values, and number of trades included in our final samples.

We  did  not  select  any  securities  from  deciles  1  through  5  because  many  of  those 
securities  are  so  thinly  traded  that  the  small  sample  sizes  would  not  permit  accurate 
estimation  of  the  ordered  probit  parameters.  For  example,  even  in  deciles  6,  7  and  8, 
containing  companies  ranging  from  $133  million  to  $946  million  in  market  value,  there 
were  still  six  companies  for  which  the  maximum  likelihood  estimation  procedure  did  not 
converge:  MCI,  NET,  OCQ,  NPR,  SIX  and  SW.  In  all  of  these  cases,  the  sample  sizes 
were  relatively  small,  yielding  ill-behaved  and  erratic  likelihood  functions. 

Table 7 presents summary statistics for this sample of 100 securities broken down by
deciles. As expected, the larger stocks tend to have higher prices, lower time-between-
trades, higher bid/ask spreads [in ticks], and larger median dollar volume per trade. Note
that the statistics for $T_\lambda(V_k) \cdot IBS_k$ implicitly include estimates $\hat\lambda$ of the Box-Cox parameter
which differ across stocks. Also, although the mean and standard deviation of $T_\lambda(V_k) \cdot IBS_k$
for decile 6 differ dramatically from those of the other deciles, these differences are driven
solely by the outlier XTR. When this security is dropped from decile 6, the mean and
standard deviation of $T_\lambda(V_k) \cdot IBS_k$ become -0.0244 and 0.3915 respectively, much more
in line with the values of the other deciles.

In Table 8 we summarize the price impact measures across deciles, where we now
define price impact to be the increase in the conditional expected price change as dollar
volume increases from a base case of $1,000 to either the median dollar volume for each
individual stock [the first panel of Table 8] or a dollar volume of $100,000 [the second
panel]. The first two rows of both panels report decile means and standard deviations
of the absolute price impact [measured in ticks], whereas the second two rows of both
panels report decile means and standard deviations of percentage price impact [measured
as percentages of the mean of the high and low prices of each stock]. For each stock i, we
set $\Delta t_k$ and $AB_{k-1}$ to their sample means for that stock and condition on the following


^* We also discarded [without replacement] randomly chosen stocks that were obviously mutual funds, replacing them with
new random draws.
values for the other regressors:

$$
\begin{aligned}
V_{k-2} &= \tfrac{1}{100} \cdot \text{Median Dollar Volume for Stock } i \\
V_{k-3} &= \tfrac{1}{100} \cdot \text{Median Dollar Volume for Stock } i \\
SP500_{k-1} &= 0.001 \\
SP500_{k-2} &= 0.001 \\
SP500_{k-3} &= 0.001 \\
Z_{k-1} &= 1 \\
Z_{k-2} &= 1 \\
Z_{k-3} &= 1 \\
IBS_{k-1} &= 1 \\
IBS_{k-2} &= 1 \\
IBS_{k-3} &= 1
\end{aligned}
$$

so that we are assuming the three most recent trades are buyer-initiated, accompanied by
price increases of 1 tick each, and the sizes of the two earlier trades are equal to the median
dollar volume of the particular stock in question.
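
The price impact calculation described above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the specification estimated in the paper: the conditional mean is collapsed into a constant `other_terms` (standing in for all regressors held at their conditioning values) plus a single term `beta_v * T_lambda(V/100)` for the most recent buyer-initiated trade, the conditional standard deviation `sd` is held fixed, and every numerical value is hypothetical. The conversion to percent of price assumes a tick of one eighth of a dollar.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def box_cox(v, lam):
    """Box-Cox transform T_lambda(v); the logarithm is the lam = 0 case."""
    return math.log(v) if lam == 0 else (v ** lam - 1.0) / lam

def expected_ticks(boundaries, mean, sd):
    """E[Z_k] in ticks, with states symmetric about zero (e.g. -4..+4)."""
    cuts = [-math.inf] + list(boundaries) + [math.inf]
    cdf = [norm_cdf((c - mean) / sd) for c in cuts]
    half = len(boundaries) // 2
    states = range(-half, half + 1)
    return sum(s * (cdf[i + 1] - cdf[i]) for i, s in enumerate(states))

def price_impact(volume, base_volume, boundaries, other_terms, beta_v, lam, sd):
    """Change in E[Z_k] when dollar volume rises from base_volume to volume."""
    def mean_at(v):
        # Volume enters in hundreds of dollars; trade is buyer-initiated (IBS = +1).
        return other_terms + beta_v * box_cox(v / 100.0, lam)
    return (expected_ticks(boundaries, mean_at(volume), sd)
            - expected_ticks(boundaries, mean_at(base_volume), sd))

def percent_of_price(ticks, high, low):
    """Express a tick change as a percentage of the average of high and low prices."""
    return 100.0 * ticks * 0.125 / ((high + low) / 2.0)

# Hypothetical parameter values, for illustration only.
boundaries = [-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]
params = dict(boundaries=boundaries, other_terms=-0.8, beta_v=0.05, lam=0.2, sd=1.2)
impacts = [price_impact(v, 1_000, **params) for v in (1_000, 10_000, 100_000)]
print(impacts)
```

Holding everything else fixed, a larger trade shifts the conditional mean upward and therefore raises the expected price change; the Box-Cox parameter governs how quickly that effect flattens out as volume grows.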

From  Table  8  we  see  that  conditional  on  a  dollar  volume  equal  to  the  median  for 
the  most  recent  trade,  larger  capitalization  stocks  tend  to  exhibit  larger  absolute  price 
impact,  no  doubt  due  to  their  higher  prices  and  their  larger  median  dollar  volumes  per 
trade.  However,  as  percentages  of  the  average  of  their  high  and  low  prices,  the  price  impact 
across  deciles  is  relatively  constant  as  shown  by  the  third  row  in  the  first  panel  of  Table 
8:  the  average  price  impact  for  a  median  trade  in  decile  6  is  0.0612  percent,  compared  to 
0.0523  percent  in  decile  10.  When  conditioning  on  a  dollar  volume  of  $100,000  however,  the 
results  are  quite  different:  the  average  absolute  price  impact  is  similar  across  deciles,  but 
the average relative price impact is considerably smaller in decile 10 [0.0778 percent] than
in decile 6 [0.2250 percent]. Not surprisingly, a fixed $100,000 trade will have a greater
percentage  price  impact  on  smaller  capitalization,  less  liquid  stocks  than  on  larger  ones. 

Further insights on how price impact varies cross-sectionally can be gained from the
cross-sectional regressions in Table 9, where the four price impact measures and the Box-
Cox parameter estimates are each regressed on the following four variables: market value,
the initial price level, median dollar volume, and median time-between-trades. Entries in
the first row show that the Box-Cox parameters are inversely related to all four variables,
though none of the coefficient estimates are statistically significant and the adjusted $R^2$
is negative, a symptom of the imprecision with which the $\hat\lambda_i$'s are estimated. But the two
percentage price impact regressions seem to have higher explanatory power, with adjusted
$R^2$'s of 37.6 and 22.1 percent, respectively. These two regressions have identical sign
patterns, implying that percentage price impact is larger for smaller stocks, lower priced
stocks, higher volume stocks, and stocks that trade less frequently.
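
A negative adjusted $R^2$ simply means the raw $R^2$ is smaller than what the degrees-of-freedom penalty removes. The short sketch below uses the standard adjustment formula; the sample size of 94 stocks and the four regressors are taken from the text, on the assumption that the Table 9 regressions use the same 94 stocks as the rank correlations, and the raw $R^2$ values are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k regressors plus an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

n_stocks, n_regressors = 94, 4         # assumed to match the Table 9 regressions

# The adjusted value is negative whenever the raw R-squared is below k / (n - 1).
threshold = n_regressors / (n_stocks - 1)

weak_fit = adjusted_r2(0.02, n_stocks, n_regressors)     # hypothetical raw R-squared
better_fit = adjusted_r2(0.40, n_stocks, n_regressors)   # hypothetical raw R-squared
print(round(threshold, 3), round(weak_fit, 3), round(better_fit, 3))
```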

In Table 10, we report Spearman rank correlations between the dependent and
independent variables of Table 9, which are nonparametric measures of association and are
asymptotically normal with mean 0 and variance $1/(n-1)$ under the null hypothesis of
pairwise independence [see, for example, Randles and Wolfe (1979)]. Since $n = 94$, the
two standard error confidence interval about 0 for each of the correlation coefficients is
[ -0.207 , 0.207 ]. The sign patterns are much the same in Table 10 as in Table 9, despite
the fact that the Spearman rank correlations are not partial correlation coefficients.
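
The two-standard-error band follows directly from the asymptotic variance $1/(n-1)$. The sketch below reproduces that arithmetic and includes a minimal rank-correlation function; the function assumes there are no ties in the data, and the example series are hypothetical.

```python
import math
import numpy as np

def two_se_band(n):
    """Half-width of the two-standard-error interval about 0: 2 / sqrt(n - 1)."""
    return 2.0 / math.sqrt(n - 1)

def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of ranks (no ties assumed)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])

band = two_se_band(94)
print(round(band, 3))

# Hypothetical data: any strictly increasing transformation leaves the ranks unchanged.
x = np.array([0.3, 1.2, 2.5, 4.1, 7.7])
print(spearman(x, np.exp(x)), spearman(x, -x))
```

A rank correlation from Table 10 would be judged significant in this informal sense when its absolute value exceeds the band.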

Of  course,  such  cross-sectional  regressions  and  rank  correlations  serve  only  as  informal 
summaries  of  the  data  since  they  are  not  formally  linked  to  any  explicit  theories  of  how 
price  impact  should  vary  across  stocks.  Nevertheless  they  are  consistent  with  our  earlier 
findings  from  the  11  stocks,  suggesting  that  those  results  are  not  specific  to  the  behavior  of 
a  few  possibly  peculiar  stocks,  but  may  be  evidence  of  a  more  general  and  stable  mechanism 
for  transaction  prices. 

8.  Conclusion. 

Using 1988 transactions data from the ISSM database, we find that the sequence of
trades does affect the conditional distribution for price changes, and the effect is greater for
larger capitalization and more actively traded securities. Trade size is also an important
factor in the conditional distribution of price changes, with larger trades creating more
price pressure, but in a nonlinear fashion. The price impact of a trade depends critically
on the sequence of past price changes and order flows [buy/sell/buy versus sell/buy/buy].
The ordered probit framework allows us to compare the price impact of trading over many
different market scenarios, such as trading "with" versus "against" the market, trading in
"up and down" markets, etc. Finally, we show that discreteness does matter, in the sense
that the simpler linear regression analysis of price changes cannot capture all the features
of transaction price changes evident in the ordered probit estimates, such as the clustering
of price changes on even eighths.

With  these  simple  applications,  we  hope  to  have  shown  that  the  ordered  probit  model 
is a flexible and powerful tool for investigating the dynamic behavior of transaction prices.
Much like the linear regression model for continuous-valued data, the ordered probit model
can capture and summarize complex relations between discrete-valued and continuous-
valued  data.  Indeed,  even  in  the  simple  applications  we  considered  here,  we  suffered  from 
an  embarrassment  of  riches  in  that  there  were  many  other  empirical  implications  of  our 
ordered  probit  estimates  that  we  did  not  have  space  to  report.  For  example,  we  compared 
the  price  impact  of  only  one  or  two  sequences  of  order  flows,  price  history,  and  market 
return  -  there  are  many  other  combinations  of  market  conditions,  some  that  might  yield 
considerably  different  findings.  By  choosing  other  scenarios,  a  deeper  understanding  of 
how  transaction  prices  react  to  changing  market  conditions  can  be  obtained. 

Although  we  selected  a  wide  range  of  regressors  to  illustrate  the  flexibility  of  ordered 
probit,  in  practice  the  specific  application  will  dictate  which  regressors  to  include.  If,  for 
example,  one  is  interested  in  testing  the  implications  of  Admati  and  Pfleiderer's  (1988) 
model  of  intra-day  patterns  in  price  and  volume,  time-of-day  indicators  in  the  conditional 
mean and variance could be added. If one is interested in measuring how liquidity and
price impact vary across markets, an exchange indicator would be appropriate. For intra-
day  event  studies,  "event"  indicators  in  both  the  conditional  mean  and  variance  are  the 
natural  regressors,  and  in  such  cases  the  generalized  residuals  we  calculated  as  diagnostics 
can  also  be  used  to  construct  cumulative  average  [generalized]  residuals. 

In  our  simple  applications,  we  have  only  hinted  at  the  kinds  of  insights  that  ordered 
probit  can  yield;  the  possibilities  expand  exponentially  as  we  consider  the  many  ways 
our  basic  specification  can  be  changed  to  accommodate  the  growing  number  of  highly 
parametrized  and  less  stylized  theories  about  the  market  microstructure.  We  expect  to  see 
many  other  applications  in  the  near  future. 
Table 3

Price impact of trades as measured by the change in conditional mean of $Z_k$, or $\Delta E[Z_k]$,
when trade sizes are increased incrementally above the base case of a $5,000 trade. These
changes are computed from the ordered probit probabilities, conditional on the three most
recent trades being buyer-initiated, and the three most recent price changes being +1 tick
each, for International Business Machines Corporation (IBM - 206,794 trades), Abitibi-
Price Incorporated (ABY - 1,145 trades), Quantum Chemical Corporation (CUE - 26,927
trades), Dow Chemical Company (DOW - 81,890 trades), First Chicago Corporation
(FNB - 17,783 trades), and Foster Wheeler Corporation (FWC - 18,199 trades), for the
sample period from 4 January 1988 to 30 December 1988. Percentage price impact is
computed as a percentage of the average of the high and low prices.


            $ Volume       IBM      ABY      CUE      DOW      FNB      FWC

Price Impact (Ticks)
E[Z_k]:        5,000    -1.315   -0.350   -0.629   -1.117   -0.790   -0.956
ΔE[Z_k]:      10,000     0.060    0.027    0.072    0.057    0.037    0.025
ΔE[Z_k]:      20,000     0.118    0.053    0.144    0.114    0.073    0.054
ΔE[Z_k]:      50,000     0.193    0.088    0.239    0.188    0.121    0.096
ΔE[Z_k]:     100,000     0.248    0.113    0.310    0.242    0.157    0.133
ΔE[Z_k]:     250,000     0.319    0.147    0.403    0.313    0.203    0.189
ΔE[Z_k]:     500,000     0.371    0.173    0.473    0.366    0.238    0.236

Price Impact (% of Price)
E[Z_k]:        5,000    -0.141   -0.235   -0.090   -0.164   -0.363   -0.831
ΔE[Z_k]:      10,000     0.006    0.018    0.010    0.008    0.017    0.022
ΔE[Z_k]:      20,000     0.013    0.036    0.021    0.017    0.034    0.047
ΔE[Z_k]:      50,000     0.021    0.059    0.034    0.027    0.056    0.084
ΔE[Z_k]:     100,000     0.027    0.076    0.045    0.036    0.072    0.116
ΔE[Z_k]:     250,000     0.034    0.099    0.058    0.046    0.093    0.164
ΔE[Z_k]:     500,000     0.040    0.116    0.068    0.054    0.109    0.205
Table 3 (Continued)

Price impact of trades as measured by the change in conditional mean of $Z_k$, or $\Delta E[Z_k]$,
when trade sizes are increased incrementally above the base case of a $5,000 trade. These
changes are computed from the ordered probit probabilities, conditional on the three most
recent trades being buyer-initiated, and the three most recent price changes being +1
tick each, for Handy and Harman Company (HNH - 3,174 trades), Navistar International
Corporation (NAV - 96,127 trades), Reebok International Limited (RBK - 62,778 trades),
Sears Roebuck and Company (S - 94,127 trades), and American Telephone and Telegraph
Company (T - 180,726 trades), for the sample period from 4 January 1988 to 30 December
1988. Percentage price impact is computed as a percentage of the average of the high and
low prices.


            $ Volume       HNH      NAV      RBK        S        T

Price Impact (Ticks)
E[Z_k]:        5,000    -0.621   -1.670   -1.459   -1.492   -1.604
ΔE[Z_k]:      10,000     0.019    0.017    0.035    0.050    0.022
ΔE[Z_k]:      20,000     0.041    0.037    0.075    0.063    0.046
ΔE[Z_k]:      50,000     0.074    0.070    0.137    0.109    0.082
ΔE[Z_k]:     100,000     0.103    0.100    0.192    0.146    0.113
ΔE[Z_k]:     250,000     0.148    0.148    0.276    0.200    0.159
ΔE[Z_k]:     500,000     0.188    0.191    0.350    0.243    0.197

Price Impact (% of Price)
E[Z_k]:        5,000    -0.474   -3.796   -1.275   -0.475   -0.736
ΔE[Z_k]:      10,000     0.015    0.038    0.030    0.010    0.010
ΔE[Z_k]:      20,000     0.031    0.084    0.065    0.020    0.021
ΔE[Z_k]:      50,000     0.057    0.158    0.120    0.035    0.038
ΔE[Z_k]:     100,000     0.079    0.227    0.168    0.047    0.052
ΔE[Z_k]:     250,000     0.113    0.336    0.241    0.064    0.073
ΔE[Z_k]:     500,000     0.143    0.434    0.305    0.077    0.090


Table 4

Price impact of trades as measured by the change in conditional mean of $Z_k$, or $\Delta E[Z_k]$,
when trade sizes are increased incrementally above the base case of a $5,000 trade. These
changes are computed from the ordered probit probabilities, conditional on the three most
recent trades being buyer-initiated, and the three most recent price changes being 0 tick
each, for International Business Machines Corporation (IBM - 206,794 trades), Abitibi-
Price Incorporated (ABY - 1,145 trades), Quantum Chemical Corporation (CUE - 26,927
trades), Dow Chemical Company (DOW - 81,890 trades), First Chicago Corporation
(FNB - 17,783 trades), and Foster Wheeler Corporation (FWC - 18,199 trades), for the
sample period from 4 January 1988 to 30 December 1988. Percentage price impact is
computed as a percentage of the average of the high and low prices.


            $ Volume       IBM      ABY      CUE      DOW      FNB      FWC

Price Impact (Ticks)
E[Z_k]:        5,000    -0.328   -0.210   -0.460   -0.345   -0.160   -0.214
ΔE[Z_k]:      10,000     0.037    0.026    0.071    0.047    0.030    0.021
ΔE[Z_k]:      20,000     0.073    0.051    0.142    0.094    0.061    0.045
ΔE[Z_k]:      50,000     0.120    0.084    0.236    0.154    0.101    0.080
ΔE[Z_k]:     100,000     0.155    0.109    0.306    0.200    0.131    0.111
ΔE[Z_k]:     250,000     0.200    0.142    0.398    0.260    0.170    0.156
ΔE[Z_k]:     500,000     0.234    0.167    0.468    0.305    0.201    0.195

Price Impact (% of Price)
E[Z_k]:        5,000    -0.035   -0.141   -0.066   -0.051   -0.073   -0.186
ΔE[Z_k]:      10,000     0.004    0.017    0.010    0.007    0.014    0.018
ΔE[Z_k]:      20,000     0.008    0.034    0.020    0.014    0.028    0.039
ΔE[Z_k]:      50,000     0.013    0.057    0.034    0.023    0.046    0.070
ΔE[Z_k]:     100,000     0.017    0.074    0.044    0.029    0.060    0.096
ΔE[Z_k]:     250,000     0.021    0.096    0.057    0.038    0.078    0.136
ΔE[Z_k]:     500,000     0.025    0.112    0.067    0.045    0.092    0.169
Table 4 (Continued)

Price impact of trades as measured by the change in conditional mean of $Z_k$, or $\Delta E[Z_k]$,
when trade sizes are increased incrementally above the base case of a $5,000 trade. These
changes are computed from the ordered probit probabilities, conditional on the three most
recent trades being buyer-initiated, and the three most recent price changes being 0 tick
each, for Handy and Harman Company (HNH - 3,174 trades), Navistar International
Corporation (NAV - 96,127 trades), Reebok International Limited (RBK - 62,778 trades),
Sears Roebuck and Company (S - 94,127 trades), and American Telephone and Telegraph
Company (T - 180,726 trades), for the sample period from 4 January 1988 to 30 December
1988. Percentage price impact is computed as a percentage of the average of the high and
low prices.


            $ Volume       HNH      NAV      RBK        S        T

Price Impact (Ticks)
E[Z_k]:        5,000    -0.230   -0.235   -0.208   -0.206   -0.294
ΔE[Z_k]:      10,000     0.018    0.007    0.019    0.019    0.013
ΔE[Z_k]:      20,000     0.038    0.016    0.042    0.040    0.028
ΔE[Z_k]:      50,000     0.070    0.031    0.077    0.070    0.050
ΔE[Z_k]:     100,000     0.098    0.044    0.110    0.094    0.069
ΔE[Z_k]:     250,000     0.140    0.066    0.161    0.129    0.098
ΔE[Z_k]:     500,000     0.177    0.087    0.207    0.159    0.123

Price Impact (% of Price)
E[Z_k]:        5,000    -0.175   -0.534   -0.182   -0.066   -0.135
ΔE[Z_k]:      10,000     0.014    0.017    0.017    0.006    0.006
ΔE[Z_k]:      20,000     0.029    0.037    0.036    0.013    0.013
ΔE[Z_k]:      50,000     0.053    0.070    0.067    0.022    0.023
ΔE[Z_k]:     100,000     0.074    0.100    0.096    0.030    0.032
ΔE[Z_k]:     250,000     0.107    0.151    0.140    0.041    0.045
ΔE[Z_k]:     500,000     0.135    0.197    0.181    0.051    0.056


Table 5

Tests of equally spaced partition boundaries $\{\alpha_i\}$ from the ordered probit model for
International Business Machines Corporation (IBM - 206,794 trades), Abitibi-Price
Incorporated (ABY - 1,145 trades), Quantum Chemical Corporation (CUE - 26,927 trades), Dow
Chemical Company (DOW - 81,890 trades), First Chicago Corporation (FNB - 17,783
trades), Foster Wheeler Corporation (FWC - 18,199 trades), Handy and Harman
Company (HNH - 3,174 trades), Navistar International Corporation (NAV - 96,127 trades),
Reebok International Limited (RBK - 62,778 trades), Sears Roebuck and Company (S -
94,127 trades), and American Telephone and Telegraph Company (T - 180,726 trades),
for the sample period from 4 January 1988 to 30 December 1988. Entries in the column
labelled "m" denote the number of states in the ordered probit specification. The 5 and 1
percent critical values of a $\chi^2_2$ random variate are 5.99 and 9.21, respectively. The 5 and 1
percent critical values of a $\chi^2_6$ random variate are 12.6 and 16.8, respectively.


Stock    Sample Size    ψ ~ χ²(m-3)     m

IBM          206,794      15,682.35     9
ABY            1,145          11.94     5
CUE           26,927         366.41     9
DOW           81,890       2,057.79     9
FNB           17,783         537.42     9
FWC           18,199         188.28     5
HNH            3,174          30.59     5
NAV           96,127         998.13     9
RBK           62,778       2,138.16     9
S             94,127       2,487.80     9
T            180,726       1,968.39     9
Table 6


Names, ticker symbols, market values, and sample sizes over the sample period from 4 January 1988 to 30 December 1988 for
100 randomly selected stocks for which the ordered probit model was estimated. The selection procedure involved ranking all
companies on the CRSP daily returns file by beginning-of-year market value and randomly choosing 20 companies in each of
deciles 6 through 10, discarding companies which are clearly identified as equity mutual funds. Asterisks next to ticker symbols
indicate those securities for which the maximum likelihood estimation procedure did not converge.


Ticker                                            Market Value    Sample
Symbol   Company Name                                 x $1,000      Size

Decile 6
ACP      AMERICAN REAL ESTATE PARTNERS L               217,181     2,394
BCL      BIOCRAFT LABS INC                             230,835     7,092
CUL      CULLINET SOFTWARE INC                         189,680    18,712
DCY      D C N Y CORP                                  149,073     1,667
FCH      FIRST CAPITAL HLDGS CORP                      169,088     8,899
GYK      GIANT YELLOWKNIFE MINES LTD                   137,337     1,594
ITX      INTERNATIONAL TECHNOLOGY CORP                 161,960    14,675
LOM      LOMAS & NETTLETON MTG INVS                    219,450     6,471
MCI*     MASSMUTUAL CORPORATE INVS INC                 169,390       727
NET*     NORTH EUROPEAN OIL RTY TR                     134,848       708
NPK      NATIONAL PRESTO INDS INC                      193,489     1,222
OCQ*     ONEIDA LTD                                    133,665     1,643
OIL      TRITON ENERGY CORP                            196,815     3,203
SII      SMITH INTERNATIONAL INC                       148,779     5,435
SKY      SKYLINE CORP                                  145,821     5,804
SPF      STANDARD PACIFIC CORP DE L P                  215,360    11,630
TOL      TOLL BROTHERS INC                             167,463     6,619
WIC      W I C O R INC                                 228,044     1,331
WJ       WATKINS JOHNSON CO                            192,648     1,647
XTR      XTRA CORP                                     163,465     1,923

Decile 7
CER      CILCORP INC                                   400,138     1,756
CKL      CLARK EQUIPMENT CO                            408,609    11,580
CTP      CENTRAL MAINE POWER CO                        353,648     5,326
DEI      DIVERSIFIED ENERGIES INC DE                   396,505     3,411
FDO      FAMILY DOLLAR STORES INC                      286,533     8,513
FRM      FIRST MISSISSIPPI CORP                        306,931     8,711
FUR      FIRST UNION REAL EST EQ&MG INVTS              329,041     3,213
KOG      KOGER PROPERTIES INC                          266,815     3,508
KWD      KELLWOOD COMPANY                              236,271     4,138
LOG      RAYONIER TIMBERLANDS L P                      302,500     2,670
MGM      M G M U A COMMUNICATIONS                      312,669    10,376
NPR*     NEW PLAN RLTY TR                              376,332     1,983
OKE      ONEOK INC                                     234,668    12,788
SFA      SCIENTIFIC ATLANTA INC                        263,801    16,853
SIX*     MOTEL 6 LP                                    396,768     2,020
SJM      SMUCKER J M CO                                378,931       762
SPW      S P X CORP                                    366,168     7,304
SRR      STRIDE RITE CORP                              245,218     6,767
TGR      TIGER INTERNATIONAL INC                       352,968    21,612
TRN      TRINITY INDUSTRIES INC                        457,366    18,219
Table 6 (Continued)


Ticker                                            Market Value    Sample
Symbol   Company Name                                 x $1,000      Size

Decile 8
APS      AMERICAN PRESIDENT COS LTD                    617,376    21,554
CAW      CAESARS WORLD INC                             625,828    17,900
CBT      CABOT CORP                                    897,905     6,277
DDS      DILLARD DEPARTMENT STORES INC                 758,327     7,267
ERB      ERBAMONT N V                                  796,698     8,007
FSI      FLIGHT SAFETY INTL INC                        833,456     4,562
FVB      FIRST VIRGINIA BANKS INC                      496,325     2,637
GLK      GREAT LAKES CHEM CORP                         938,358     6,982
HD       HOME DEPOT INC                                921,506    16,025
HPH      HARNISCHFEGER INDUSTRIES INC                  469,921     7,573
KU       KENTUCKY UTILITIES CO                         675,997     8,116
LAC      LAC MINERALS LTD NEW                          921,456     4,900
NVP      NEVADA POWER CO                               604,785     8,159
ODR      OCEAN DRILLING & EXPL CO                      849,965     4,694
PA       PRIMERICA CORP NEW                            946,507    35,390
PST      PETRIE STORES CORP                            730,688    12,291
REN      ROLLINS ENVIRONMENTAL SVCS INC                825,353    44,272
SW*      STONE & WEBSTER INC                           499,568       847
TW       T W SERVICES INC                              691,852    16,863
USR      UNITED STATES SHOE CORP                       618,686    24,991

Decile 9
ABS      ALBERTSONS INC                              1,695,456    14,171
BDX      BECTON DICKINSON & CO                       2,029,188    17,499
CCL      CARNIVAL CRUISE LINES INC                   1,294,152     7,111
CYR      CRAY RESEARCH INC                           2,180,374    26,459
FFC      FUND AMERICAN COS INC                       1,608,525     6,884
FG       U S F & G CORP                              2,163,821    66,848
GOU      GULF CANADA RESOURCES LIMITED               1,866,365     2,071
GWF      GREAT WESTERN FINANCIAL CORP                1,932,755    20,705
MEA      MEAD CORP                                   2,131,043    35,796
MEG      MEDIA GENERAL INC                           1,002,059     6,304
MLL      MACMILLAN INC                               1,387,400    22,083
NSP      NORTHERN STATES POWER CO MN                 1,852,777    14,482
PDQ      PRIME MOTOR INNS INC                        1,006,803    11,470
PKN      PERKIN ELMER CORP                           1,088,400    17,181
RYC      RAYCHEM CORP                                1,597,194    16,680
SNG      SOUTHERN NEW ENGLAND TELECOM                1,397,070     4,662
SPS      SOUTHWESTERN PUBLIC SERVICE CO                966,688    10,640
TET      TEXAS EASTERN CORP                          1,146,380    29,428
WAG      WALGREEN COMPANY                            1,891,310    23,684
WAN      WANG LABS INC                               1,801,476    36,607
Table 6 (Continued)


Ticker                                            Market Value    Sample
Symbol   Company Name                                 x $1,000      Size

Decile 10
AN       AMOCO CORP                                  7,745,076    39,906
BN       BORDEN INC                                  3,671,366    22,630
BNI      BURLINGTON NORTHERN INC                     4,644,263    33,224
BT       BANKERS TRUST NY CORP                       2,426,399    18,502
CAT      CATERPILLAR INC DE                          6,137,566    86,379
CBS      CBS INC                                     3,709,910    18,630
CCB      CAPITAL CITIES ABC INC                      5,581,410    14,585
CPC      CPC INTERNATIONAL INC                       3,317,679    27,852
DUK      DUKE POWER CO                               4,341,008    17,918
GCI      GANNETT INC                                 6,335,081    33,512
GIS      GENERAL MILLS INC                           4,378,513    26,786
MAS      MASCO CORP                                  2,867,259    25,746
MHP      MCGRAW HILL INC                             2,438,169    36,047
NT       NORTHERN TELECOM LTD                        4,049,909    10,128
NYN      NYNEX CORP                                  3,101,539    40,514
PCG      PACIFIC GAS & ELEC CO                       5,982,064    93,981
PFE      PFIZER INC                                  7,693,452    68,035
RAL      RALSTON PURINA CO                           4,517,751    24,710
SGP      SCHERING PLOUGH CORP                        6,438,652    34,161
UCC      UNION CAMP CORP                             2,672,966    14,080
Table 7


Summary statistics for the sample of 100 randomly chosen securities for the sample period from 4 January 1988 to 30 December
1988. Note: market values are computed at the beginning of the year.


Statistic                          Decile 6   Decile 7   Decile 8   Decile 9  Decile 10

Low Price ($)
  Decile Mean                         13.94      17.95      21.47      28.02      60.00
  Decile Std. Dev.                     9.14       0.75      12.47      12.95      62.27

High Price ($)
  Decile Mean                         21.11      27.25      33.61      41.39      77.56
  Decile Std. Dev.                    11.42      12.16      14.85      21.20      76.93

Market Value x $10^9
  Decile Mean                         0.177      0.333      0.726      1.602      5.563
  Decile Std. Dev.                    0.033      0.065      0.167      0.414      3.737

% Prices > Midquote
  Decile Mean                         40.68      41.47      41.77      42.53      43.65
  Decile Std. Dev.                     6.36       6.37       3.98       3.71       3.19

% Prices = Midquote
  Decile Mean                         17.13      19.08      17.91      18.47      16.85
  Decile Std. Dev.                     3.99       3.67       4.61       3.93       2.97

% Prices < Midquote
  Decile Mean                         42.18      39.45      40.32      39.00      39.60
  Decile Std. Dev.                     4.03       4.77       4.30       3.80       2.15

Mean($Z_k$)
  Decile Mean                        0.0085     0.0038     0.0058    -0.0006     0.0015
  Decile Std. Dev.                   0.0200     0.0115     0.0103     0.0054     0.0065

Mean($\Delta t_k$)
  Decile Mean                      1,085.91     873.66     629.35     430.74     222.49
  Decile Std. Dev.                   512.59     489.01     431.70     330.26     109.14

Mean($AB_k$)
  Decile Mean                        2.1947     2.3316     2.4926     2.5583     2.9938
  Decile Std. Dev.                   0.5396     0.4657     0.3989     0.6514     1.6637

Mean($S\&P500_k$)
  Decile Mean                       -0.0048    -0.0037    -0.0026    -0.0020    -0.0009
  Decile Std. Dev.                   0.0080     0.0035     0.0025     0.0019     0.0006

Mean($IBS_k$)
  Decile Mean                       -0.0150     0.0202     0.0145     0.0353     0.0395
  Decile Std. Dev.                   0.0987     0.1064     0.0695     0.0640     0.0455

Mean($T_\lambda(V_k) \cdot IBS_k$)
  Decile Mean                        3.9822     0.1969     0.0782     0.2287     0.3017
  Decile Std. Dev.                  17.9222     0.6193     0.3230     0.3661     0.2504

Median Trading Volume ($)
  Decile Mean                         6,002      7,345     12,182     16,483     28,310
  Decile Std. Dev.                    2,728      3,136      4,985     10,074     13,474

Mean $\lambda$
  Decile Mean                        0.1347     0.0710     0.0127     0.0230     0.0252
  Decile Std. Dev.                   0.2579     0.1517     0.0451     0.0679     0.1050
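
The decile entries in Table 7 are cross-sectional means and standard deviations over the securities in each decile. As a rough consistency check, the sketch below recomputes the Decile 9 market-value entries from the twenty Decile 9 market values listed in Table 6 (reported there in thousands of dollars). Two things are assumptions rather than statements from the text: that the Table 7 market values are expressed in billions of dollars, and that the reported standard deviation uses the sample divisor n - 1. The variable names are illustrative only.

```python
import statistics

# Decile 9 market values as listed in Table 6, in thousands of dollars.
decile9_market_value_thousands = [
    1_695_456, 2_029_188, 1_294_152, 2_180_374, 1_608_525,
    2_163_821, 1_866_365, 1_932_755, 2_131_043, 1_002_059,
    1_387_400, 1_852_777, 1_006_803, 1_088_400, 1_597_194,
    1_397_070,   966_688, 1_146_380, 1_891_310, 1_801_476,
]

# Convert thousands of dollars to billions of dollars (assumed Table 7 unit).
in_billions = [v / 1e6 for v in decile9_market_value_thousands]

decile_mean = statistics.mean(in_billions)
# Sample standard deviation (divisor n - 1); this choice is an assumption.
decile_std = statistics.stdev(in_billions)

print(f"Decile 9 market value: mean {decile_mean:.3f}, std. dev. {decile_std:.3f}")
```

Under these assumptions the recomputed values should round to roughly 1.602 and 0.414, the Decile 9 entries of the Market Value row.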

Table  8 


Price impact measures, defined as the increase in conditional expected price change given
by the ordered probit model as the volume of the most recent trade is increased from a
base case of $1,000 to either the median level of volume for each security or a level of
$100,000, for the sample of 100 randomly chosen securities for the sample period from 4
January 1988 to 30 December 1988. Percentage price impact measures are percentages of
the average of the high and low prices of each security.


Price Impact Measure                  Decile 6   Decile 7   Decile 8   Decile 9  Decile 10

In Ticks, $V_{k-1}$ = Median
  Decile Mean                           0.0778     0.0991     0.1342     0.1420     0.2020
  Decile Std. Dev.                      0.0771     0.0608     0.0358     0.0532     0.0676

In %, $V_{k-1}$ = Median
  Decile Mean                           0.0612     0.0600     0.0703     0.0583     0.0523
  Decile Std. Dev.                      0.0336     0.0286     0.0207     0.0229     0.0262

In Ticks, $V_{k-1}$ = $100,000
  Decile Mean                           0.2240     0.2611     0.2620     0.2521     0.2849
  Decile Std. Dev.                      0.1564     0.1174     0.0499     0.0617     0.0804

In %, $V_{k-1}$ = $100,000
  Decile Mean                           0.2250     0.1660     0.1442     0.1148     0.0778
  Decile Std. Dev.                      0.1602     0.0745     0.0570     0.0633     0.0383
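
The price impact measure in Table 8 is a difference of two conditional expectations under the ordered probit model. The following is a minimal sketch of that calculation, not the authors' implementation. It assumes a standard normal latent error, a volume term that enters the latent index as a coefficient times the Box-Cox transform of lagged dollar volume for a buyer-initiated trade, and a tick size of one eighth of a dollar. Every numerical value (`STATES`, `THRESHOLDS`, `BETA_VOLUME`, `LAMBDA`, `BASE_INDEX`, the 10,000 "median" volume, and the high and low prices) is hypothetical and chosen only for illustration.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def box_cox(v, lam):
    """Box-Cox transform T_lambda(v); the log limit is used when lam == 0."""
    return math.log(v) if lam == 0 else (v ** lam - 1.0) / lam

def expected_price_change(index, thresholds, states, sigma=1.0):
    """E[Z_k | X_k] = sum_i s_i * Prob(alpha_{i-1} < Z*_k <= alpha_i)."""
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    probs = [norm_cdf((hi - index) / sigma) - norm_cdf((lo - index) / sigma)
             for lo, hi in zip(cuts, cuts[1:])]
    return sum(s * p for s, p in zip(states, probs))

# Hypothetical values, NOT estimates from the paper.
STATES = list(range(-4, 5))                         # price changes in ticks
THRESHOLDS = [-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]
BETA_VOLUME = 0.05                                  # coefficient on T_lambda(V) * IBS
LAMBDA = 0.1                                        # Box-Cox parameter
BASE_INDEX = 0.0                                    # all other regressors combined

def price_impact_ticks(volume, base_volume=1_000.0):
    """Increase in E[Z_k] (ticks) as lagged dollar volume rises from base_volume."""
    def mean_at(v):
        # Buyer-initiated trade assumed, so the indicator IBS equals +1.
        index = BASE_INDEX + BETA_VOLUME * box_cox(v, LAMBDA)
        return expected_price_change(index, THRESHOLDS, STATES)
    return mean_at(volume) - mean_at(base_volume)

def price_impact_percent(impact_ticks, high, low, tick_size=0.125):
    """Impact as a percentage of the average of the high and low prices."""
    return 100.0 * impact_ticks * tick_size / ((high + low) / 2.0)

impact_median = price_impact_ticks(10_000.0)    # 10,000 stands in for a median volume
impact_large = price_impact_ticks(100_000.0)
print(f"median volume: {impact_median:.4f} ticks, "
      f"{price_impact_percent(impact_median, high=40.0, low=30.0):.4f} %")
print(f"$100,000:      {impact_large:.4f} ticks, "
      f"{price_impact_percent(impact_large, high=40.0, low=30.0):.4f} %")
```

Because the expected price change is increasing in the latent index, a positive volume coefficient gives a larger impact at $100,000 than at the smaller volume, which is the qualitative pattern in the decile means of Table 8.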
Table 9


Cross-sectional regressions for Box-Cox parameters $\lambda$ and price impact measures for the
sample of 100 randomly chosen securities, of which 94 are included in the regression since
the maximum likelihood estimation procedure did not converge for the omitted 6, for the
sample period from 4 January 1988 to 30 December 1988. All the coefficients have been
multiplied by a factor of 1,000. Z-statistics are given in parentheses, each of which is
asymptotically distributed as $N(0,1)$ under the null hypothesis that the corresponding
coefficient is zero.


Dependent Variable          Constant     Market    Initial     Median     Median      $R^2$
                                          Value      Price     Volume   $\Delta t_k$

$\lambda$                     118.74      -2.08      -7.42      -8.39      -2.55     -0.008
                              (2.11)    (-0.31)    (-1.35)    (-1.04)    (-0.33)

Price Impact (Ticks)           93.82       9.86       1.76       5.25      -2.31      0.184
$V_{k-1}$ = Median            (3.72)     (3.27)     (0.71)     (1.45)    (-0.66)

Price Impact (%)               36.07      -1.19      -2.31       6.66       0.67      0.376
$V_{k-1}$ = Median            (4.46)    (-1.23)    (-2.92)     (5.72)     (0.60)

Price Impact (Ticks)          265.34       8.07      -5.64      -3.59       3.25      0.003
$V_{k-1}$ = $100,000          (7.03)     (1.79)    (-1.52)    (-0.66)     (0.62)

Price Impact (%)              138.52      -8.53      -9.61       8.53       1.74      0.221
$V_{k-1}$ = $100,000          (4.17)    (-2.15)    (-2.95)     (1.78)     (0.38)
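
Since each z-statistic in Table 9 is asymptotically N(0,1) under the null hypothesis, it can be converted to an approximate p-value. The sketch below does this for the Market Value column. The use of a two-sided test and a 5% level is an assumption made for illustration, and the dictionary keys are hypothetical labels for the five regressions.

```python
import math

def two_sided_p_value(z):
    """P(|N(0,1)| >= |z|), computed with the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# z-statistics of the Market Value coefficient, read off Table 9 row by row.
market_value_z = {
    "lambda": -0.31,
    "impact_ticks_median": 3.27,
    "impact_pct_median": -1.23,
    "impact_ticks_100k": 1.79,
    "impact_pct_100k": -2.15,
}

p_values = {name: two_sided_p_value(z) for name, z in market_value_z.items()}
# Regressions in which the Market Value coefficient is significant at 5%.
significant = [name for name, p in p_values.items() if p < 0.05]

for name, p in p_values.items():
    print(f"{name:22s} z = {market_value_z[name]:6.2f}  p = {p:.4f}")
```

Note that the reported coefficients are scaled by 1,000, so the Market Value coefficient of 9.86 in the tick-impact regression corresponds to 0.00986 in the original units; the z-statistics are unaffected by this scaling.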
Table 10

Spearman rank correlations of the Box-Cox parameters $\lambda$ and price impact measures with
market value, initial price, median volume, and median trade times for the sample of 100
randomly chosen securities, of which 94 are used since the maximum likelihood estimation
procedure did not converge for the omitted 6, over the sample period from 4 January 1988
to 30 December 1988. Under the null hypothesis of independence, each of the correlation
coefficients is asymptotically normal with mean 0 and variance $1/(n-1)$, hence the two
standard error confidence interval for these correlation coefficients is $[-0.207,\ 0.207]$.


                               Market    Initial     Median     Median
                                Value      Price     Volume   $\Delta t$

$\lambda$                      -0.260     -0.503     -0.032     -0.015

Price Impact (Ticks)            0.604      0.678      0.282     -0.360
$V_{k-1}$ = Median

Price Impact (%)               -0.156     -0.447      0.486      0.082
$V_{k-1}$ = Median

Price Impact (Ticks)            0.273      0.329     -0.020     -0.089
$V_{k-1}$ = $100,000

Price Impact (%)               -0.547     -0.815      0.088      0.316
$V_{k-1}$ = $100,000
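
The two-standard-error band quoted in the caption of Table 10 follows directly from the variance 1/(n - 1) with n = 94. The sketch below reproduces that arithmetic and flags which Market Value correlations fall outside the band. It also includes a small Spearman rank correlation helper (average ranks for ties, then the Pearson correlation of the ranks) as one common way to compute the statistic; the helper names are illustrative and are not taken from the paper.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0        # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

n = 94                                   # securities with converged estimates
band = 2.0 / math.sqrt(n - 1)            # two standard errors under independence

# Market Value column of Table 10, top to bottom.
market_value_corr = [-0.260, 0.604, -0.156, 0.273, -0.547]
outside_band = [r for r in market_value_corr if abs(r) > band]

print(f"band = +/-{band:.3f}; outside: {outside_band}")
```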

10.22.91 


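[Figure 1: density of $Z_k^*$ partitioned by the thresholds $\alpha_1, \ldots, \alpha_8$ into nine regions with probabilities $p_1, \ldots, p_9$.]


Figure 1.


Illustration of ordered probit probabilities $p_i$, which are determined by the $\alpha_i$'s and the
distribution of $Z_k^*$. In particular, $p_i = \mathrm{Prob}(Z_k = s_i) = \mathrm{Prob}(\alpha_{i-1} < Z_k^* \le \alpha_i)$, $i = 1, \ldots, 9$,
where, for notational simplicity, we define $\alpha_0 = -\infty$ and $\alpha_9 = +\infty$.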
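
As a short worked step, the definition in the caption implies that the nine probabilities sum to one. Writing $F$ for the distribution function of $Z_k^*$ (the symbol $F$ is introduced here only for illustration), each probability is a difference of $F$ at adjacent thresholds, and the sum telescopes:

```latex
\begin{align*}
p_i &= \mathrm{Prob}(\alpha_{i-1} < Z_k^* \le \alpha_i)
     = F(\alpha_i) - F(\alpha_{i-1}), \qquad i = 1, \ldots, 9, \\
\sum_{i=1}^{9} p_i
    &= F(\alpha_9) - F(\alpha_0)
     = F(+\infty) - F(-\infty)
     = 1 - 0 = 1 .
\end{align*}
```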
[Figure 2: histogram panels. Recoverable panel titles: Histogram of Price Changes - ABY; Histogram of Time Between Trades - IBM; Histogram of Dollar Volume - IBM; Histogram of Price Changes - CUE; Histogram of Price Changes - FNB; Histogram of Dollar Volume - FNB; Histogram of Price Changes - RBK; Histogram of Price Changes - HNH; Histogram of Price Changes - NAV.]

Figure 2.

Figure 3.
Figure 4.


Percentage price impact as a function of dollar volume computed from ordered probit probabilities, conditional
on the three most recent trades being buyer-initiated, and the three most recent price changes being +1
tick each, for IBM (206,794 trades), ABY (1,145 trades), CUE (26,927 trades), DOW (81,890 trades), FNB
(17,783 trades), FWC (18,199 trades), HNH (3,174 trades), NAV (96,127 trades), RBK (62,778 trades),
S (94,127 trades), T (180,726 trades), for the sample period from 4 January 1988 to 30 December 1988.
Percentage price impact is measured as a percentage of the average of the high and low prices for each stock.
Figure 5.